Search Results - Record(s) 1 through 50 of 53 returned.
□ 1. Document ID: US 20040064261 A1

L3: Entry 1 of 53 File: PGPB Apr 1, 2004 

PGPUB-DOCUMENT-NUMBER: 20040064261
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040064261 A1

TITLE: Technique for analyzing arrayed signals using quantum expressor functions
PUBLICATION-DATE: April 1, 2004
INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Gulati, Sandeep La Canada CA US

US-CL-CURRENT: 702/19; 702/22
ABSTRACT: 

A technique for determining events of interest within an output pattern generated 
from a detected image of an array of detectors where the output pattern comprises 
signals associated with noise, and signals associated with the events of interest 
which have intensities both greater and less than intensities of signals associated 
with noise. Quantum resonance interferometry is utilized to amplify signals
associated with the events of interest having an intensity lower than the intensity 
of signals associated with noise, to an intensity greater than the intensity of the 
signals associated with noise to generate a modified output pattern. Once the 
desired signals are amplified, the technique determines which signals within the 
modified output pattern correlate with events of interest thus permitting a 
determination to be made whether a certain event of interest has occurred. 

Summary of Invention Paragraph:

[0008] Accordingly, it would be highly desirable to provide an improved method and
apparatus for analyzing the output of the DNA microarray to more expediently,
reliably, and inexpensively determine the presence of any medical conditions or
concerns within the patient providing the DNA sample. It is particularly desirable
to provide a technique that can identify mutation signatures within dot
spectrograms even in circumstances wherein the signal-to-noise ratio is extremely
low. It is to these ends that aspects of the invention are generally drawn.






□ 2. Document ID: US 20040063136 A1

L3: Entry 2 of 53 File: PGPB Apr 1, 2004

PGPUB-DOCUMENT-NUMBER: 20040063136
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040063136 A1

TITLE: Repeatable software-based active signal processing technique
PUBLICATION-DATE: April 1, 2004

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Gulati, Sandeep La Canada Flintridge CA US

US-CL-CURRENT: 435/6



ABSTRACT: 



A technique is disclosed that is useful for determining the presence of specific 
hybridization expression within an output pattern generated from a digitized image 
of a biological sample applied to an arrayed platform. The output pattern includes 
signals associated with noise, and signals associated with the biological sample, 
some of which are degraded or obscured by noise. The output pattern is first 
segmented using tessellation. Signal processing, such as interferometry, or more
specifically, resonance interferometry, and even more specifically quantum
resonance interferometry or stochastic resonance interferometry, is then used to
amplify signals associated with the biological sample within the segmented output 
pattern having an intensity lower than the intensity of signals associated with 
noise so that they may be clearly distinguished from background noise. The improved 
detection technique allows repeatable, rapid, reliable, and inexpensive 
measurements of arrayed platform output patterns. 

Summary of Invention Paragraph:

[0010] Accordingly, it would be highly desirable to provide an improved method and
apparatus for analyzing the output of the DNA microarray to more expediently,
reliably, and inexpensively determine the presence of any conditions within the
patient providing the DNA sample. It is particularly desirable to provide a
technique that can identify mutation signatures within dot spectrograms even in
circumstances wherein the signal-to-noise ratio is extremely low. It is to these
ends that aspects of the invention are generally drawn.

□ 3. Document ID: US 20040061702 A1

L3: Entry 3 of 53 File: PGPB Apr 1, 2004

PGPUB-DOCUMENT-NUMBER: 20040061702
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040061702 A1

TITLE: Methods and system for simultaneous visualization and manipulation of
multiple data types

PUBLICATION-DATE: April 1, 2004

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Kincaid, Robert Half Moon Bay CA US

US-CL-CURRENT: 345/440

ABSTRACT:

Software systems, methods and recordable media for organizing and manipulating 
diverse data sets to facilitate identification, trends, correlations and other 
useful relationships among the data. Extremely large data sets such as microarray 
data and other biological data are graphically displayed and sorted in an effort to 
develop visual similarities, correlations or trends that can be seen by a user of 
the present invention. Various schemes for graphical representations of the data, 
as well as sorting schemes are provided, including sorting schemes performed 
relative to pseudo-data vectors. 

Detail Description Paragraph:

[0106] The column, row and manual sorting procedures described above can be useful
in identifying correlations, trends and other relationships among the data in some
instances. However, when dealing with large volumes of experimental data, such as
microarray data sets or protein or other molecular data sets, the data sets are
often sufficiently "noisy" that it is difficult to find meaningful
correlations by simply sorting a single column (e.g., a single array) or a single
row (e.g., a single gene). When experimental data such as these are measured by
very low level signals, there may be a lot of variation in the measured values from
experiment to experiment and they are inherently "noisy." Microarrays are generally
noisy due to a number of experimental variances. Microarrays are generally
qualitatively reproducible, but the individual measurements will still show quite a
bit of variance. Thus, if a sort is performed on the basis of a single or
individual array, slightly different ordering results are observed, as compared to
the same sort performed on an array which is already known to be similar. These
differences may even occur when a sorting procedure is performed on two different
arrays representing the same experiment (i.e., a replicated experiment) due to
differences in noise levels between the two arrays. To address these problems, the
present invention further provides the capability of performing similarity sorting,
which includes the ability to sort the data set by row or column similarity.
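
As a rough illustration of the similarity sorting described in paragraph [0106], the sketch below orders rows by their similarity to a chosen reference row. The Pearson correlation metric, the function names, and the toy values are assumptions made for illustration; the excerpt does not state which similarity measure the patent uses.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant row has no defined correlation; treat as dissimilar
    return cov / math.sqrt(var_a * var_b)

def similarity_sort(rows, reference):
    """Return row names ordered from most to least similar to `reference`."""
    ref = rows[reference]
    return sorted(rows, key=lambda name: pearson(rows[name], ref), reverse=True)

# Hypothetical expression values: one row per gene, one column per array.
rows = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with gene_a
    "gene_c": [1.1, 1.9, 3.2, 3.9],  # noisy copy of gene_a
    "gene_d": [2.0, 2.0, 2.1, 2.0],  # nearly flat
}
order = similarity_sort(rows, "gene_a")
print(order)
```

Because the ordering depends on the whole row rather than on a single column, a small amount of noise in one array shifts the ranking less than it would in a single-column sort.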



□ 4. Document ID: US 20040053277 A1

L3: Entry 4 of 53 File: PGPB Mar 18, 2004

PGPUB-DOCUMENT-NUMBER: 20040053277
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040053277 A1

TITLE: Strong gene sets for glioma classification

PUBLICATION-DATE: March 18, 2004

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Zhang, Wei Houston TX US
Fuller, Greg Houston TX US
Dougherty, Ed College Station TX US
Hess, Kenneth Houston TX US

US-CL-CURRENT: 435/6; 435/7.23



ABSTRACT:

The present invention provides a number of gene markers whose expression is altered
in various gliomas. In particular, by examining the expression of these markers, one
can accurately classify a glioma as glioblastoma multiforme (GM), anaplastic
astrocytoma (AA), anaplastic oligodendroglioma (AO) or oligodendroglioma (OL). The
diagnosis may be performed on nucleic acids, for example, using a DNA microarray,
or on protein, for example, using immunologic means. Also disclosed are methods of
therapy.

Detail Description Paragraph:

[0260] The inventors used a novel method to find both strong classifiers and strong
features. This method is briefly described in the Materials and Methods section and
is explicated in more detail in (Kim et al., 2002). This algorithm considers the
inherently variable or "high-noise" nature of microarray measurements and mimics
this fuzziness by adding noise to the sample sets. The basic idea is that if the
data are deliberately made "worse" and classifier genes can still be identified,
then these genes are very likely to be robust.
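
The noise-injection idea in paragraph [0260] can be sketched as follows. This is a minimal illustration and not the method of Kim et al.: the Gaussian noise model, the mean-difference score, the threshold, and all names and values are assumptions.

```python
import random
import statistics

def separation(values_a, values_b):
    """Score a gene by the absolute difference between the two class means."""
    return abs(statistics.mean(values_a) - statistics.mean(values_b))

def robust_fraction(class_a, class_b, sigma, threshold, trials=200, seed=0):
    """Fraction of noisy replicates in which the gene still separates the classes.

    Each trial adds independent Gaussian noise (standard deviation `sigma`)
    to every sample, deliberately making the data "worse", then re-scores
    the gene against `threshold`.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        noisy_a = [v + rng.gauss(0.0, sigma) for v in class_a]
        noisy_b = [v + rng.gauss(0.0, sigma) for v in class_b]
        if separation(noisy_a, noisy_b) >= threshold:
            hits += 1
    return hits / trials

# Hypothetical expression levels for two genes in two tumor classes.
strong_gene = ([10.0, 10.5, 9.8, 10.2], [4.0, 4.3, 3.9, 4.1])
weak_gene = ([5.0, 5.4, 4.9, 5.2], [4.6, 4.9, 4.4, 4.8])

for name, (a, b) in {"strong": strong_gene, "weak": weak_gene}.items():
    print(name, robust_fraction(a, b, sigma=1.0, threshold=2.0))
```

A gene whose class separation survives in nearly every noisy replicate is a candidate "strong feature" in the sense of the paragraph above; one that passes only occasionally is likely an artifact of noise.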






□ 5. Document ID: US 20040038206 A1

L3: Entry 5 of 53 File: PGPB Feb 26, 2004

PGPUB-DOCUMENT-NUMBER: 20040038206
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040038206 A1

TITLE: Method for high throughput assay of genetic analysis

PUBLICATION-DATE: February 26, 2004

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Zhang, Jia Glendora CA US
Li, Kai Glendora CA US

US-CL-CURRENT: 435/6

ABSTRACT:

Methods for high throughput assay of genetic analysis are provided. Genetic
materials of either DNA or RNA are used as the template for primer extension using
target specific primers. Following primer extension, the extended products with
labeled nucleotides integrated are kept on the solid support and used for
visualization and detection. As compared to other methods for genetic assay, this
method is quick, reliable, and compatible with several types of genetic analysis,
including polymorphism, gene expression profiling, and sequencing.

Detail Description Paragraph:

[0040] Prior to this invention, both enzymatic primer extension and oligonucleotide
hybridization have been used in the analysis of polymorphism as mentioned above.
These two strategies seem very different and could not be integrated in one
method. Primer extension followed by gel electrophoresis is very reliable and
gives a clear yes-or-no answer. But this method has several limitations.
First, it is not easy to apply to multiple polymorphism sites or in a high
throughput assay. Second, the editing function from the 3'→5' exonuclease
activity of a variety of polymerases may cause false positive results. This false
positive potential often requires further confirmation by sequencing analysis. In
contrast, gene microarrays using oligonucleotide hybridization can analyze
multiple polymorphisms and are compatible with high throughput assay. However, the
latter method is not as sensitive and reliable as the one using primer extension
followed by gel electrophoresis. Oligonucleotide hybridization is less sensitive
and less reliable because of its basic principle: identifying
polymorphism by comparing the signal/noise ratio based on one base mismatch.

□ 6. Document ID: US 20040027350 A1

L3: Entry 6 of 53 File: PGPB Feb 12, 2004

PGPUB-DOCUMENT-NUMBER: 20040027350
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20040027350 A1

TITLE: Methods and system for simultaneous visualization and manipulation of
multiple data types

PUBLICATION-DATE: February 12, 2004

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Kincaid, Robert Half Moon Bay CA US
Vailaya, Aditya Santa Clara CA US

US-CL-CURRENT: 345/440
ABSTRACT:

Software systems and methods for organizing and manipulating diverse data sets to
facilitate identification, trends, correlations and other useful relationships
among the data. Extremely large data sets such as microarray data and other
biological data are graphically displayed and sorted in an effort to develop visual
similarities, correlations or trends that can be seen by a user of the present
invention. Various schemes for graphical representations of the data, as well as
sorting schemes are provided. Additionally, non-experimental or other data can be
displayed and tracked along with the data upon which the sorting schemes are
processed.

Detail Description Paragraph:

[0102] The column, row and manual sorting procedures described above can be useful
in identifying correlations, trends and other relationships among the data in some
instances. However, when dealing with large volumes of experimental data, such as
microarray data sets or protein or other molecular data sets, the data sets are
often sufficiently "noisy" that it is difficult to find meaningful
correlations by simply sorting a single column (e.g., a single array) or a single
row (e.g., a single gene). When experimental data such as these are measured by
very low level signals, there may be a lot of variation in the measured values from
experiment to experiment and they are inherently "noisy." Microarrays are generally
noisy due to a number of experimental variances. Microarrays are generally
qualitatively reproducible, but the individual measurements will still show quite a
bit of variance. Thus, if a sort is performed on the basis of a single or
individual array, slightly different ordering results are observed, as compared to
the same sort performed on an array which is already known to be similar. These
differences may even occur when a sorting procedure is performed on two different
arrays representing the same experiment (i.e., a replicated experiment) due to
differences in noise levels between the two arrays. To address these problems, the
present invention further provides the capability of performing similarity sorting,
which includes the ability to sort the data set by row or column similarity.

□ 7. Document ID: US 20030233197 A1

L3: Entry 7 of 53 File: PGPB Dec 18, 2003

PGPUB-DOCUMENT-NUMBER: 20030233197
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030233197 A1
TITLE: Discrete bayesian analysis of data
PUBLICATION-DATE: December 18, 2003

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Padilla, Carlos E. Lexington MA US
Karlov, Valeri I. Framingham MA US

US-CL-CURRENT: 702/20; 702/179, 702/19, 706/46, 706/52, 706/924, 707/104.1
ABSTRACT:

A probabilistic approximation of a data distribution is provided, wherein uncertain 
measurements in data are fused together to provide an indication of whether a new 
data item belongs to a given disease model. The probabilistic approximation is 
provided in accordance with a Bayesian analysis technique that examines the 
relationship of probability distributions for observable events x and multiple 
hypotheses H regarding those events. 
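
For readers unfamiliar with the approach, the relationship between observable events x and hypotheses H referred to in the abstract is Bayes' rule, P(H_i | x) = P(x | H_i) P(H_i) / Σ_j P(x | H_j) P(H_j). The sketch below applies it to a discretized measurement. The hypothesis names, bins, and probabilities are invented for illustration and are not taken from the patent.

```python
def posterior(priors, likelihoods, observation):
    """Discrete Bayes update.

    priors:      {hypothesis: P(H)}
    likelihoods: {hypothesis: {observation_bin: P(x | H)}}
    Returns {hypothesis: P(H | x)} for the given observation bin.
    """
    joint = {h: priors[h] * likelihoods[h][observation] for h in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: p / evidence for h, p in joint.items()}

# Hypothetical two-hypothesis model with one binned measurement.
priors = {"disease": 0.1, "healthy": 0.9}
likelihoods = {
    "disease": {"low": 0.2, "high": 0.8},
    "healthy": {"low": 0.7, "high": 0.3},
}
post = posterior(priors, likelihoods, "high")
print(post)
```

If further measurements are conditionally independent given the hypothesis, they can be fused by passing the returned posterior back in as the prior for the next observation.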



Detail Description Paragraph:

[0312] Gene expression signatures allowing for discrimination of breast cancer
patients exhibiting a short interval (<5 years) to distant metastases from those
remaining free of metastases after 5 years were identified. The data set included
78 patients: 44 patients with "good prognosis" (continued to be metastasis-free
after at least 5 years) and 34 patients with "poor prognosis" (developed distant
metastasis within 5 years). All patients were lymph node negative and under 55
years of age at diagnosis. Gene expression data for each patient was obtained from
DNA microarrays containing 24,481 human genes and included the following fields:
intensities, intensity ratios, and measurement noise characteristics (P-values).

□ 8. Document ID: US 20030226963 A1

L3: Entry 8 of 53 File: PGPB Dec 11, 2003

PGPUB-DOCUMENT-NUMBER: 20030226963
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030226963 A1

TITLE: System and method for the preparation of arrays of biological or other
molecules

PUBLICATION-DATE: December 11, 2003

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Cooks, Robert G. West Lafayette IN US
Ouyang, Zheng West Lafayette IN US

US-CL-CURRENT: 250/283

ABSTRACT:



A method of separating species in a mixture of molecules, particles or atoms and 
collecting the separated species is described. The method comprises the steps of 
converting by ionization the species in the mixture to gas phase ions, separating 
the gas phase ions according to their mass charge ratio and/or mobility and 
collecting the separated ions. The system includes ionizing means such as 
electrospray to form the gas phase ions. The gas phase ions are separated by 
filtering, or in time or in space and soft-landed for collection such as on a 
surface. 

Detail Description Paragraph:

[0036] In one example proteins and biomolecules were soft-landed using a linear
quadrupole mass filter. A commercial Thermo Finnigan (San Jose, Calif.) SSQ 710C,
FIG. 3, was modified by adding an electrospray ionization (ESI) source. The source 
included a syringe 31 which introduced the protein mixture into the capillary 32. A 
high voltage (HV) was applied between the capillary 32 and the ionization chamber 
(not shown) for electrospray ionization. The various chambers (not shown) and 
elements of the instrument and their pressures are schematically shown and 
identified in FIG. 3. The microarray plate 13 was mounted for x-y movement in the 
last evacuated chamber. An x-y microarray plate drive is not shown since its 
construction is well within the skill of those practicing the art. In one example a 
flow rate of 0.5 µl/min was used throughout the experiments. The surface for ion
landing was located behind the detector assembly. In the ion detection mode, the
high voltages on the conversion dynode 33 and the multiplier 34 were turned on and
the ions were detected to allow the overall spectral qualities, signal-to-noise
ratio and mass resolution over the full mass range to be examined. In the ion-
landing mode, the voltages on the conversion dynode and the multiplier were turned
off and the ions were allowed to pass through the hole in the detection assembly to 
reach the gold surface of the plate 13. The surface was grounded and the potential 
difference between the source and the surface was 0 volts. 






□ 9. Document ID: US 20030226098 A1

L3: Entry 9 of 53 File: PGPB Dec 4, 2003

PGPUB-DOCUMENT-NUMBER: 20030226098
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030226098 A1

TITLE: Methods for analysis of measurement errors in measured signals

PUBLICATION-DATE: December 4, 2003

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Weng, Lee Bellevue WA US

US-CL-CURRENT: 714/798
ABSTRACT:

The present invention provides methods for analyzing measurement errors in measured 
signals obtained in an experiment, e.g., measured intensity signals obtained in a 
microarray gene expression experiment. In particular, the invention provides a 
method for transforming measured signals into a domain in which the measurement 
errors in the transformed signals are normalized by errors as determined from an 
error model. The methods of the invention are particularly useful for analyzing 
measurement errors in signals in which at least a portion of the error is dependent
on the magnitudes of the signals. Such transformed signals permit analysis of data
using traditional statistical methods, e.g., ANOVA and regression analysis.
Magnitude-independent errors can also be used for comparing levels of measurement
errors in signals of different magnitudes.

Detail Description Paragraph:

[0053] Errors in measured signals can be described by error models (see, e.g.,
Supplementary material to Roberts et al., 2000, Science, 287:873-880; and Rocke et
al., 2001, J. Computational Biology 8:557-569). In preferred embodiments, an error
model contains two or three error terms to describe the dominant error sources. In
a two-term error model, a first error term is used to describe the low-level
additive error which comes from, e.g., the background of the array chip. Since this
additive error has a constant variance, in this disclosure, it is also called the
constant error. The constant error is independent of the hybridization levels of
individual spots on a microarray. It may come from the combination of the scanner
electronics noise and/or fluorescence due to nonspecific binding of fluorescence
molecules to the surface of the microarray. In one embodiment, this constant
additive error is taken to have a normal distribution with a mean bkg and a
standard deviation σ_bkg. After background level subtraction, which is typically
applied in microarray data processing, the additive mean bkg becomes zero. In this
disclosure, it is often assumed that the background intensity offset has been
corrected. A person of ordinary skill in the art will appreciate that in cases
where the background mean is not corrected, the methods of the invention can be
used with an additional step of making such a correction.
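
The constant additive error term and the effect of background subtraction described in paragraph [0053] can be illustrated with a small simulation. The numeric values, the variable names, and the use of the sample mean of blank spots as the background estimate are assumptions made for this sketch; only the normally distributed additive error comes from the text.

```python
import random
import statistics

rng = random.Random(42)

BKG = 50.0       # assumed mean of the additive background error
SIGMA_BKG = 5.0  # assumed standard deviation of the additive error

def measure(true_signal):
    """Simulated spot intensity: true signal plus normal additive error."""
    return true_signal + rng.gauss(BKG, SIGMA_BKG)

# Blank spots (true signal 0) expose the additive error directly.
blanks = [measure(0.0) for _ in range(5000)]
bkg_estimate = statistics.mean(blanks)

# Background subtraction removes the offset but not the spread.
corrected = [b - bkg_estimate for b in blanks]
print(round(statistics.mean(corrected), 6))   # offset after subtraction
print(round(statistics.stdev(corrected), 2))  # spread after subtraction
```

The point of the sketch is that subtraction shifts the mean of the additive error to zero while leaving its standard deviation unchanged, which is why the text can treat it as a constant, zero-mean error term afterwards.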






□ 10. Document ID: US 20030224385 A1

L3: Entry 10 of 53 File: PGPB Dec 4, 2003

PGPUB-DOCUMENT-NUMBER: 20030224385
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030224385 A1

TITLE: Targeted genetic risk-stratification using microarrays
PUBLICATION-DATE: December 4, 2003
INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Pihan, German Weston MA US

US-CL-CURRENT: 435/6; 435/91.2
ABSTRACT:

The invention relates to new expedient and cost-effective assays that are capable 
of identifying many or all relevant diagnostic and prognostic genetic lesions in 
cancer or cancer predisposition using multiplex PCR or other nucleic acid 
amplification or enrichment technology in conjunction with bead microarrays for the 
purpose of risk-stratifying patients with cancer or cancer predisposition. The new 
assay methods are referred to herein as BARCODE-MT for Bead ARray coded DEtection
of Multiple Targets. These assays are high-throughput, and can be automated for 
highly accurate diagnoses that can be used to optimize risk-adapted therapy. 

Detail Description Paragraph:

[0031] The new BARCODE-MT assay provides fast and accurate analysis of clinical
samples by multiplexing target testing at both the amplification and detection
steps. For multiplex detection, the invention uses bead microarrays because of
their attractive cost/sample ratio and the flexibility of the format for devising
new tests or modifying existing ones. The use of bead microarrays solves the
conundrum of multiplex PCR, i.e., the increasing difficulty in unambiguously
identifying a specifically amplified target by size alone (e.g., by gel
electrophoresis or capillary electrophoresis) as the number of possible targets
increases. Because bead microarrays are customized and compatible with fast flow
read-through systems, fluid phase bead microarrays compare favorably with solid
phase arrays for these types of assays. The adoption of a high-speed hybridization-
based detection method is also amenable to automation, which is an important
feature, because automation reduces operator input and the associated risk of
error. Because of the favorable signal-to-noise ratio of our assay, case calling is
unambiguous as demonstrated by the mixing and coding experiments described below,
as well as the perfect correlation between single target PCR and BARCODE-MT.

□ 11. Document ID: US 20030219797 A1

L3: Entry 11 of 53 File: PGPB Nov 27, 2003

PGPUB-DOCUMENT-NUMBER: 20030219797
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030219797 A1

TITLE: Statistical modeling to analyze large data arrays

PUBLICATION-DATE: November 27, 2003

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Zhao, Lue Ping Bellevue WA US
Prentice, Ross Seattle WA US
Breeden, Linda Seattle WA US

US-CL-CURRENT: 435/6; 702/20

ABSTRACT:



A method for analyzing large data arrays is provided. In one aspect, the invention 
provides a method for analyzing data from two or more data arrays. Each array 
includes a plurality of members, each member provides a signal, and the data is 
indexed by one or more parameters. In one embodiment, the method includes fitting a 
model to the data; determining the goodness of the fit by evaluating the 
statistical significance of the fit; and determining the statistical significance 
of the signal. In another embodiment, the method further includes correcting the 
data for heterogeneity among members prior to fitting the model to the data. 

Detail Description Paragraph:

[0070] Accordingly, one embodiment of the invention employs a statistical model 
(SPM) to identify and characterize single pulses of transcription that occur at 
invariant times in consecutive cell cycles. SPM is a specific application of 
statistical modeling, but the basic strategy can be applied to any large data set 
to identify genes undergoing a transcriptional response to a stimulus. Due to its 
relative simplicity, statistical modeling can be used to interrogate large data 
sets without employing additional filters to reduce the number of genes to be 
analyzed. It also includes heterogeneity parameters that will tend to reduce the 
impact of noise in the data sets. SPM identifies regularly oscillating transcripts 
without regard to the abundance of the transcript or the height or timing of the 
peak, and provides estimates of the mean time of activation and deactivation. These 
values are only estimates, but they are unbiased under the assumed SPM and can be 
considered defining characteristics of individual genes. SPM also provides 
statistical measures of the quality of the parameter estimates so that optimal 
groupings can be made and subjected to further analysis. These features of 
statistical modeling complement and augment the other methods used to analyze 
microarray data. 






□ 12. Document ID: US 20030219764 A1

L3: Entry 12 of 53 File: PGPB Nov 27, 2003

PGPUB-DOCUMENT-NUMBER: 20030219764
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030219764 A1

TITLE: Biological discovery using gene regulatory networks generated from multiple-
disruption expression libraries

PUBLICATION-DATE: November 27, 2003

INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Imoto, Seiya Tokyo JP
Goto, Takao Tokyo JP
Miyano, Satoru Tokyo JP
Tashiro, Kosuke Higashi-ku JP
de Hoon, Michiel J.L Tokyo JP
Savoie, Christopher J. Yanagawa JP
Kuhara, Saturo Minami-Ku JP

US-CL-CURRENT: 435/6; 702/20
ABSTRACT:

Embodiments of this invention include application of new inferential methods to 
analysis of complex biological information, including gene networks. In some 
embodiments, disruptant data and/or drug induction/inhibition data are obtained
simultaneously for a number of genes in an organism. New methods include 
modifications of Boolean inferential methods and application of those methods to 
determining relationships between expressed genes in organisms. Additional new 
methods include modifications of Bayesian inferential methods and application of 
those methods to determining cause and effect relationships between expressed 
genes, and in some embodiments, for determining upstream effectors of regulated 
genes. Additional modifications of Bayesian methods include use of heterogeneous 
variance and different curve fitting methods, including spline functions, to 
improve estimation of graphs of networks of expressed genes. Other embodiments 
include the use of bootstrapping methods and determination of edge effects to more 
accurately provide network information between expressed genes. Methods of this 
invention were validated using information obtained from prior studies, as well as 
from newly carried out studies of gene expression. 

Detail Description Paragraph:

[0178] In cDNA microarray experiments, gene expression levels are typically 
measured at a small number of time points. Conventional techniques of time series 
analysis, such as Fourier analysis or autoregressive or moving-average modeling, 
are not suitable for such a small number of data points. Instead, the gene 
expression data are often analyzed by clustering techniques or by considering the 
relative change in the gene expression level only. Such a "fold-change" analysis 
may miss significant changes in gene expression levels, while it may inadvertently 
attribute significance to measurements dominated by noise. In addition, a fold- 
change analysis may not be able to identify important features in the temporal gene 
expression response. 
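
The limitation of fold-change analysis noted in paragraph [0178] can be seen in a toy calculation. The intensities, the noise level, and the three-times-noise rule of thumb below are invented for illustration and do not come from the patent.

```python
def fold_change(before, after):
    """Ratio of expression after versus before a stimulus."""
    return after / before

NOISE = 20.0  # assumed measurement noise, in the same intensity units

# (name, intensity before, intensity after): hypothetical values.
genes = [
    ("low_abundance", 10.0, 30.0),       # 3-fold, but the change is within noise
    ("high_abundance", 1000.0, 1500.0),  # 1.5-fold, change far exceeds noise
]

results = {}
for name, before, after in genes:
    fc = fold_change(before, after)
    exceeds_noise = abs(after - before) > 3 * NOISE
    results[name] = (fc, exceeds_noise)
    print(f"{name}: fold change {fc:.1f}, change exceeds 3x noise: {exceeds_noise}")
```

With a typical 2-fold cutoff, the low-abundance gene would be reported even though its change is smaller than the assumed noise, while the high-abundance gene, whose change is many times the noise, would be missed.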

US-CL-CURRENT: 435/6; 702/20

ABSTRACT:

The invention provides methods for diagnosing biological states or conditions based 
on ratios of gene expression data from tissue samples, such as cancer tissue 
samples. The invention also provides sets of genes that are expressed 
differentially in malignant pleural mesothelioma. These sets of genes can be used 
to discriminate between normal and malignant tissues, and between classes of 
malignant tissues. Accordingly, diagnostic assays for classification of tumors, 
prediction of tumor outcome, selecting and monitoring treatment regimens and 
monitoring tumor progression/regression also are provided. 
DOCUMENT-IDENTIFIER: US 20030219760 A1 
TITLE: Diagnostic and prognostic tests 

Detail Description Paragraph : 

[0247] Identification of prognostic molecular markers in mesothelioma. We have 
previously identified for study a representative cohort of 31 mesothelioma tumors 
obtained at pneumonectomy (17). The estimated median patient survival (11 months, 
FIG. 6A) and histological distribution of this group mirror those of mesothelioma 
patients in our practice (6). The histological subtype of the tumor was not 
predictive of outcome for these samples (P=0.129, log-rank test, FIG. 6B), even 
though the estimated median survival of epithelial subtype samples (17 months) was 
longer than that for non-epithelial subtype samples (8.5 months). To identify genes 
that are discriminatory between tumors from patients with widely divergent survival 
and to create an expression ratio-based predictor model, we utilized microarray 
data (17) for mesothelioma samples that originated from patients whose survival was 
within the 25th percentile of both disease-related survival extremes 
irrespective of tumor histological subtype (i.e., the training set, n=17, Table 
9A). We formed two groups using these samples: relatively good outcome 
(survival ≥17 months, n=8) and relatively poor outcome (survival ≤6 
months, n=9). The most accurate model developed in the training set was 
subsequently tested in an independent cohort of samples (i.e., the test set, n=29, 
Table 9B). We searched all of the genes represented on the microarray for those 
with a statistically significant ≥2-fold difference in average expression 
levels between good outcome and poor outcome tumors in the training set of samples. 
To minimize the effects of background noise, the list of distinguishing genes was 
further refined by requiring that the mean expression level be >500 in at least one 
of the two sample sets. We identified a total of 46 prognostic genes in this 
analysis with an estimated false discovery rate of 10%-20%. The 10 genes with the 
lowest P values overexpressed in each group are listed in Table 10. 
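
The two numeric filters in paragraph [0247] (a ≥2-fold difference in average 
expression between outcome groups, and a mean expression level >500 in at least 
one group) can be sketched as follows. This is a minimal illustration rather than 
the applicants' code: the function name select_prognostic_genes, the dictionary 
input layout, and the handling of a zero mean are assumptions, and the 
statistical significance test that the paragraph also requires is omitted. 

```python
from statistics import mean


def select_prognostic_genes(good, poor, min_fold=2.0, min_level=500.0):
    """Return gene names that pass the fold-change and expression-level filters.

    good, poor: dicts mapping gene name -> list of expression levels for the
    good-outcome and poor-outcome training samples (hypothetical layout).
    """
    selected = []
    for gene in good:
        g, p = mean(good[gene]), mean(poor[gene])
        hi, lo = max(g, p), min(g, p)
        # The mean expression level must exceed min_level in at least one group.
        if hi <= min_level:
            continue
        # Require at least a min_fold difference in either direction.
        # Assumption: a non-positive lower mean counts as passing.
        if lo <= 0 or hi / lo >= min_fold:
            selected.append(gene)
    return selected
```

A gene whose means are 1100 and 450 passes both filters, while a gene whose means 
are 250 and 95 is rejected by the expression-level filter despite its larger fold 
difference. 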

□ 14. Document ID: US 20030219150 A1 

L3: Entry 14 of 53 File: PGPB Nov 27, 2003 

PGPUB-DOCUMENT-NUMBER: 20030219150 
PGPUB-FILING-TYPE: new 

DOCUMENT-IDENTIFIER: US 20030219150 A1 






TITLE: Method, system, and computer code for finding spots defined in biological 
microarrays 

PUBLICATION-DATE: November 27, 2003 

INVENTOR-INFORMATION: 

NAME                    CITY             STATE   COUNTRY   RULE-47 
Niles, Richard K.       San Francisco    CA      US 
Martens, Christine L.   Portola Valley   CA      US 
Foskett, Darach B.      San Francisco    CA      US 

US-CL-CURRENT: 382/128 



ABSTRACT : 



A method (and system) for using information contained within the scanned image to 
create, in an automated (or semi-automated) process, an accurate data grid. The 
process has steps: enhance the image; locate blocks of spots; and find each 
individual spot in each of the blocks. Preferably, the method makes use of image 
filtering using a "Principal Frequency Filter" based on a mathematical 
determination of major periodic elements in the image to eliminate noisy, 
non-periodic signals, and of smoothed intensity profiles of the filtered image data. 
Here, the term Principal Frequency Filter is used to indicate an image-enhancing 
filter based upon a mathematical operation which identifies the major periodic 
components of the image. 
TITLE: Method, system, and computer code for finding spots defined in biological 
microarrays 



Summary of Invention Paragraph : 

[0013] In an alternative specific embodiment, the invention provides a method for 
processing data in digital images of biological microarrays to identify one or more 
groupings of spots present in the microarray image. The method receives a captured 
image of a biological microarray of spots in an electronic format. The array 
itself, which is the source of the image, in most cases is composed of a plurality 
of groupings of spots. The groupings may be defined by 1 through N, where N is an 
integer greater than 1, and each of the groupings is separated by an isolation 
region, which is substantially free from any spots. The method processes the 
captured electronic image to reduce background noise. A step of identifying at 
least the isolation region between the groupings in the captured image using a 
filter applied to the captured image is included. The filter is described according 
to periodic components of the captured image, where the periodic components are 
defined by a spatial distribution of the spots in the array image. The method 
determines the boundaries of each grouping in the digital image of the biological 
microarray to isolate any one of the groupings from any one of the other groupings, 
and stores the locations of each grouping in the image of the biological microarray 
into memory. 

Summary of Invention Paragraph : 

[0018] In a specific embodiment, many array manufacturing methods print DNA spots 
in groups of approximately uniform size, e.g., blocks of 25 columns and 28 rows, 
though any numbers can be chosen. In some cases there may be variation in the size 
of blocks found on an array. Localization of spots is simplified and improved by 
analyzing separately each block or grouping of spots on the image of the 
microarray. Thus the invention first uses a procedure to locate, on the image of 
the microarray, the limits of each block by identifying the spaces between blocks, 
in which the only signals present are background noise. These spaces can be 
detected as troughs in a smoothed, filtered one-dimensional intensity profile which 
is obtained by averaging the intensity of fluorescence emission along rows or 
columns in the image of the microarray. However, factors often can cause a 
significant reduction of the depth of such intensity profile troughs: primarily, 
deviation of the rows or columns from the perfect perpendicular, and background 
noise. The invention uses processes to avoid these problems. To minimize or reduce 
the damping effects of rotation on the intensity profile, rows of blocks along the 
shorter dimension of the image are located first. Then each row of blocks is 
analyzed to locate individual blocks within that row. 
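
The trough-finding idea in paragraph [0018] can be sketched as follows: average 
the intensity along each pixel row, smooth the resulting one-dimensional profile, 
and report the runs that fall below a threshold as gaps between blocks. This is 
only an illustrative sketch under stated assumptions: the function names, the 
moving-average smoother, and the fixed threshold are not taken from the 
application, which does not specify them at this point. 

```python
def row_profile(image):
    """Average intensity of each pixel row (image: list of equal-length rows)."""
    return [sum(row) / len(row) for row in image]


def smooth(profile, width):
    """Centered moving average; the window is truncated at both ends."""
    half = width // 2
    out = []
    for i in range(len(profile)):
        window = profile[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def find_gaps(profile, threshold):
    """Return (start, end) index pairs of runs where profile < threshold."""
    gaps, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i                      # a trough begins
        elif value >= threshold and start is not None:
            gaps.append((start, i - 1))    # the trough ended on the previous row
            start = None
    if start is not None:                  # trough runs to the image edge
        gaps.append((start, len(profile) - 1))
    return gaps
```

For an image with four rows of spots, three background rows, and four more rows 
of spots, find_gaps(smooth(row_profile(image), 3), 50) reports the single gap 
covering the background rows. 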

Summary of Invention Paragraph : 

[0019] Further, to reduce the effect of background fluorescence ("noise"), the 
invention uses the Principal Frequency Filter, or PFF, as will be referenced 
herein. Briefly, this filter is defined from the digital image of the microarray by 
calculating a one-dimensional Fourier transformation and deriving from it the power 
spectrum of each row of pixels. Preferably, spectra are averaged across all rows of 
pixels in the image, and all peaks that occur in a small neighborhood of defined 
width are located. The first local peak away from zero frequency is the Principal 
Frequency, which is the spatial frequency determined by the repeating spot pattern. 
Random, non-periodic background or noise is spread out in the Fourier domain; 
periodic signals are concentrated into the Principal Frequency, with little 
contamination of this frequency by noisy nonperiodic signals. Utilizing Fourier 
transformation thus separates the pattern of important signals from spurious 
background signals. An intensity profile is determined from the power spectra of 
individual rows of pixels in a small neighborhood of the Principal Frequency, which 
eliminates the contributions of non-periodic background. The Principal Frequency 
intensity profile is then smoothed by a value of 1.5 times the estimated row 
spacing to reduce further the impact of noise and interspot spacing on the profile. 
Thresholding of the filtered intensity profile identifies locations of block edges 
along the long dimension of the array. 

Detail Description Paragraph : 

[0038] The block-finding process is based on the premise that the gaps in 
fluorescence intensity which occur between blocks can be located by identifying 
troughs in a smoothed one-dimensional intensity profile obtained by averaging the 
intensity of fluorescence emission along rows or columns. However, both rotation 
and background noise, common characteristics of printed microarrays which are 
likely to be encountered on nearly every array, can significantly reduce the depth 
of the intensity profile troughs, so the invention uses two different methods to 
correct for these potential problems. To minimize the deleterious effect of 
rotation, rows of blocks along the shorter dimension of the image are found first, 
and then each row is analyzed to locate individual blocks within the row. To 
minimize the effect of noise, this invention uses an image enhancing filter known 
as the "Principal Frequency Filter" (PFF). This filter is defined in the following 
manner. The one-dimensional power spectrum is calculated for each row of pixels: 
the number of pixels in each row is increased to a power of 2 to allow a fast 
Fourier transformation (FFT) to be used. The FFT is calculated; and the power 
spectrum is computed by summing the squares of the real and imaginary components of 
the FFT. Next, the Principal Frequency is identified: the one-dimensional row power 
spectra are averaged across all rows of pixels in the image to compute the average 
row power spectrum; all peaks are located which occur in the average row power 
spectrum in neighborhoods of an optimal width, for example 2.5%, of the adjusted 
row dimension. The first non-zero local peak is the Principal Frequency, which is 
the spatial frequency determined by the repeating spot pattern. "Peaks" are defined 
either as local maxima within the defined neighborhood for a positive image, in 
which spots appear as bright areas on a dark background, or as local minima within 
the defined neighborhood for a negative image, in which spots appear as dark areas 
on a bright background. For simplicity, we refer to peaks as high values or local 
maxima in a positive image, but the alternative definition is also intended. 
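
The steps of paragraph [0038] (pad each pixel row to a power-of-2 length, take 
the FFT, sum the squares of the real and imaginary parts, average the row spectra, 
and take the first non-zero-frequency local peak) can be sketched for a positive 
image as follows. This is a minimal sketch, not the applicants' implementation. 
The function names are hypothetical, and the min_rel floor, which ignores peaks 
weaker than 10% of the strongest non-zero-frequency component, is an added 
assumption to keep numerical noise from being reported as a peak. Zero padding 
also introduces spectral leakage near zero frequency, which a practical 
implementation would need to handle and which is not addressed here. 

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(row):
    """Zero-pad the row to a power-of-2 length and return re^2 + im^2 per bin."""
    n = 1
    while n < len(row):
        n *= 2
    padded = list(row) + [0.0] * (n - len(row))
    return [c.real ** 2 + c.imag ** 2 for c in fft(padded)]


def principal_frequency(image, neighborhood=0.025, min_rel=0.1):
    """Bin index of the first non-zero-frequency peak in the average row spectrum.

    image: list of equal-length pixel rows. neighborhood is the peak-search
    width as a fraction of the padded row length (2.5% in the source example).
    """
    spectra = [power_spectrum(row) for row in image]
    n = len(spectra[0])
    avg = [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]
    half = max(1, round(neighborhood * n / 2))
    top = n // 2                             # highest usable frequency bin
    floor = min_rel * max(avg[1:top + 1])    # assumption: noise floor
    for k in range(1, top + 1):              # bin 0 (zero frequency) is skipped
        window = avg[max(1, k - half):min(top, k + half) + 1]
        if avg[k] >= floor and avg[k] == max(window):
            return k
    return None
```

The returned bin index k corresponds to a spatial period of (padded row length)/k 
pixels, so a 64-pixel row with a spot every 8 pixels yields k = 8. 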

CLAIMS : 

4. A method for processing data in digitized images of biological microarrays to 
identify one or more groupings of spots present in the microarray, the method 
comprising: importing a captured digitized image of a biological microarray of 
spots in an electronic format, the array comprising a plurality of approximately 
rectangular groupings of spots, called blocks, the groupings being defined by 1 
through N, where N is an integer greater than 1, the blocks being arranged in a 
regular pattern, with rows and columns of blocks being separated by substantially 
horizontal and vertical isolation regions comprising background, the background 
regions being long, narrow areas approximating a rectangular shape, which are 
substantially free from any spots; processing the captured image to reduce 
background noise from the captured image; identifying at least the isolation region 
between the groupings in the captured image using a frequency domain filter applied 
to the captured image, the filter being constructed according to periodic 
components of the captured image, the periodic components being defined by a 
spatial distribution of the spots in the captured image of the microarray; 
determining the locations of the boundaries of the groupings in the captured image 
of the biological microarray to isolate any one of the groupings from any one of 
the other groupings; and storing the locations of the boundaries of the groupings 
in the captured image of the biological microarray into memory. 
ABSTRACT: 



Methods of selecting a portfolio of markers for use in diagnostic applications 
include defining diagnostic parameters, establishing a relationship among the 
parameters so that they are optimized, and selecting an optimal group of markers 
for the diagnostic application. The diagnostic parameters can include a measure of 
the relative degree of expression of a gene, a measure of the variation in the 
measurement of the degree of expression of the gene, and the relationship between 
the diagnostic and discriminating parameters can be a mean variance relationship. 

Machines programmed to conduct the method and articles that comprise instructions 
for their operation are further aspects of the invention. 
Summary of Invention Paragraph : 

[0014] Genes can be grouped so that information obtained about the set of genes in 
the group provides a sound basis for making clinically relevant judgments such as a 
diagnosis, prognosis, or treatment choice. These sets of genes make up the 
portfolios of the invention. As with most diagnostic markers, it is often desirable 
to use the fewest number of markers sufficient to make a correct medical judgment. 
This prevents a delay in treatment pending further analysis as well as 
inappropriate use of time and resources. The preferred optimal portfolio is one that 
employs the fewest number of markers for making such judgments while meeting 
conditions that maximize the probability that such judgments are indeed correct. 
These conditions will generally include sensitivity and specificity requirements. 
In the context of microarray based detection methods, the sensitivity of the 
portfolio can be reflected in the fold differences exhibited by a gene's expression 
in the diseased or aberrant state relative to the normal state. The detection of 
the differential expression of a gene is sensitive if it exhibits a large fold 
change relative to the expression of the gene in another state. Another aspect of 
sensitivity is the ability to distinguish signal from noise. For example, while the 
expression of a set of genes may show adequate sensitivity for defining a given 
disease state, if the signal that is generated by one (e.g., intensity measurements 
in microarrays) is below a level that is easily distinguished from noise in a given 
setting (e.g., a clinical laboratory) then that gene should be excluded from the 
optimal portfolio. A procedure for setting conditions such as these that define the 
optimal portfolio can be incorporated into the inventive methods. 
US-CL-CURRENT: 435/6; 536/24.3 
ABSTRACT : 

A method of diagnosing cancer by identifying differential modulation of each gene 
(relative to the expression of the same genes in a normal population) in a 
combination of genes selected from two groups of genes. 

Gene expression portfolios and kits for employing the method are further aspects of 
the invention. 

L3: Entry 16 of 53 File: PGPB Oct 16, 2003 
Summary of Invention Paragraph : 

[0013] Genes can be grouped so that information obtained about the set of genes in 
the group provides a sound basis for making clinically relevant judgments such as a 
diagnosis, prognosis, or treatment choice. These sets of genes make up the 
portfolios of the invention. As with most diagnostic markers, it is often desirable 
to use the fewest number of markers sufficient to make a correct medical judgment. 
This prevents a delay in treatment pending further analysis as well as 
inappropriate use of time and resources. The preferred optimal portfolio is one that 
employs the fewest number of markers for making such judgments while meeting 
conditions that maximize the probability that such judgments are indeed correct. 
These conditions will generally include sensitivity and specificity requirements. 
In the context of microarray based detection methods, the sensitivity of the 
portfolio can be reflected in the fold differences exhibited by a gene's expression 
in the diseased or aberrant state relative to the normal state. The detection of 
the differential expression of a gene is sensitive if it exhibits a large fold 
change relative to the expression of the gene in another state. Another aspect of 
sensitivity is the ability to distinguish signal from noise. For example, while the 
expression of a set of genes may show adequate sensitivity for defining a given 
disease state, if the signal that is generated by one (e.g., intensity measurements 
in microarrays) is below a level that is easily distinguished from noise in a given 
setting (e.g., a clinical laboratory) then that gene should be excluded from the 
optimal portfolio. A procedure for setting conditions such as these that define the 
optimal portfolio can be incorporated into the inventive methods. 
ABSTRACT: 



A system and methods for identifying sequence diversity in a gene or expressed 
sequence is disclosed wherein hybridization differences arising from polymorphic 
bases in analogous expressed sequences are identified between two or more 
nucleotide populations. By scaling the hybridization data to account for 
differences in abundance and observed intensity, sequence diversity can be 
identified in a highly specific and sensitive manner. Data confidence levels are 
also accounted for to increase the accuracy of the sequence diversity 
determination. The invention can be applied to both newly collected gene expression 
data and archived data to generate valuable insight into polymorphic behavior 
within complex nucleotide populations. 
Detail Description Paragraph : 

[0062] In one embodiment, during or subsequent to the abovementioned differencing 
or scaling procedures, an outlier removal function may be used to identify and 
remove data that exceeds a statistical threshold of reliability. Outlier data is 
characterized by spurious or inaccurate hybridization/intensity values and may not 
reflect actual hybridization intensities expected for a particular sequence. 
Outlier data may arise from experimental error or inaccuracy, systematic errors, 
localized regions of abnormally high and/or low intensity on the surface of the 
microarray, and other sources. The analytical methods for sequence diversity 
determination may desirably incorporate methods to identify outlier data for 
example by identifying data which exceeds a statistical threshold of reliability. 
Furthermore, inadequate hybridization signals from samples where a gene is not 
present are filtered out before analysis to prevent identifying differences due 
only to a background signal of noise and not to real sequence differences. These 
data may then be excluded from further analysis to minimize improperly identified 
sequence differences (e.g., false positives). Additional details of the methods 
used for determining statistical reliability of the data will be described in 
greater detail hereinbelow. 
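
Paragraph [0062] does not state which statistical threshold is used, so the 
following is only one possible reading. It first drops signals at or below a 
background level and then removes values whose modified z-score (based on the 
median absolute deviation) exceeds a cutoff. The function name remove_outliers, 
the choice of the modified z-score, and the default cutoff of 3.5 are all 
assumptions introduced for illustration and are not taken from the application. 

```python
from statistics import median


def remove_outliers(values, background=0.0, max_score=3.5):
    """Keep values above `background` whose modified z-score is <= max_score."""
    # Filter out signals indistinguishable from background first.
    signal = [v for v in values if v > background]
    if len(signal) < 3:
        return signal                  # too few points to judge outliers
    med = median(signal)
    mad = median(abs(v - med) for v in signal)
    if mad == 0:
        return signal                  # no spread: nothing can be flagged
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [v for v in signal if 0.6745 * abs(v - med) / mad <= max_score]
```

With intensities [100, 102, 98, 101, 99, 500, 2] and a background level of 10, the 
value 2 is dropped as background and 500 is dropped as an outlier. 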
ABSTRACT: 

A method of prognosticating metastasis in a breast cancer patient involves 
identifying differential modulation of each gene (relative to the expression of the 
same genes in a normal population) in a combination of genes selected from a group 
consisting of genes. 

Gene expression portfolios and kits for employing the method are further aspects of 
the invention. 

L3: Entry 18 of 53 File: PGPB Oct 9, 2003 



DOCUMENT-IDENTIFIER: US 20030190656 A1 
TITLE: Breast cancer prognostic portfolio 



Summary of Invention Paragraph : 

[0013] Genes can be grouped so that information obtained about the set of genes in 
the group provides a sound basis for making clinically relevant judgments such as a 
diagnosis, prognosis, or treatment choice. These sets of genes make up the 
portfolios of the invention. As with most diagnostic markers, it is often desirable 
to use the fewest number of markers sufficient to make a correct medical judgment. 
This prevents a delay in treatment pending further analysis as well as 
inappropriate use of time and resources. The preferred optimal portfolio is one that 
employs the fewest number of markers for making such judgments while meeting 
conditions that maximize the probability that such judgments are indeed correct. 
These conditions will generally include sensitivity and specificity requirements. 
In the context of microarray based detection methods, the sensitivity of the 
portfolio can be reflected in the fold differences exhibited by a gene's expression 
in the diseased or aberrant state relative to the normal state. The detection of 
the differential expression of a gene is sensitive if it exhibits a large fold 
change relative to the expression of the gene in another state. Another aspect of 
sensitivity is the ability to distinguish signal from noise. For example, while the 
expression of a set of genes may show adequate sensitivity for defining a given 
disease state, if the signal that is generated by one (e.g., intensity measurements 
in microarrays) is below a level that is easily distinguished from noise in a given 
setting (e.g., a clinical laboratory) then that gene should be excluded from the 
optimal portfolio. A procedure for setting conditions such as these that define the 
optimal portfolio can be incorporated into the inventive methods. 
ABSTRACT: 

A method of assessing the presence or absence of colorectal cancer or the likely 
condition of a person believed to have colorectal cancer is conducted by analyzing 
the expression of a group of genes. Gene expression profiles in a variety of media 
such as microarrays are included as are kits that contain them. 

L3: Entry 19 of 53 File: PGPB Oct 2, 2003 
Summary of Invention Paragraph : 

[0016] Statistical values can be used to confidently distinguish modulated from 
non-modulated genes and noise. Statistical tests find the genes most significantly 
different between diverse groups of samples. The Student's t-test is an example of 
a robust statistical test that can be used to find significant differences between 
two groups. The lower the p-value, the more compelling the evidence that the gene 
is showing a difference between the different groups. Nevertheless, since 
microarrays measure more than one gene at a time, tens of thousands of statistical 
tests may be asked at one time. Because of this, small p-values are likely to be 
seen just by chance, and adjustments for this using a Sidak correction as well 
as a randomization/permutation experiment can be made. A p-value less than 0.05 by 
the t-test is evidence that the gene is significantly different. More compelling 
evidence is a p-value less than 0.05 after the Sidak correction is factored in. For a 
large number of samples in each group, a p-value less than 0.05 after the 
randomization/permutation test is the most compelling evidence of a significant 
difference. 
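
The two adjustments named in paragraph [0016] can be illustrated with a short 
sketch. The Sidak adjustment for m independent tests is 1 - (1 - p)^m. The 
permutation test below enumerates every reassignment of samples to the two groups 
and counts how often the difference in group means is at least as large as the 
observed one. The function names are hypothetical, and the difference in means is 
used as the test statistic in place of the t statistic for brevity, which is an 
assumption and not something the application states. 

```python
from itertools import combinations
from statistics import mean


def sidak_adjust(p, n_tests):
    """Sidak-adjusted p-value for one of n_tests independent tests."""
    return 1.0 - (1.0 - p) ** n_tests


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    n_a, n_b, total = len(group_a), len(group_b), sum(pooled)
    hits = count = 0
    # Enumerate every way to choose which samples are labelled as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        hits += diff >= observed - 1e-12   # small tolerance for rounding
        count += 1
    return hits / count
```

A raw p-value of 1e-6 tested across 10,000 genes adjusts to roughly 0.00995, so it 
remains below 0.05. Exhaustive enumeration is only practical for small groups; 
with larger groups a random subset of permutations would normally be sampled 
instead. 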
ABSTRACT: 

A method of assessing the presence or absence of colorectal cancer or the likely 
condition of a person believed to have colorectal cancer is conducted by analyzing 
the expression of a group of genes. Gene expression profiles in a variety of media 
such as microarrays are included as are kits that contain them. 
Summary of Invention Paragraph : 

[0016] Statistical values can be used to confidently distinguish modulated from 
non-modulated genes and noise. Statistical tests find the genes most significantly 
different between diverse groups of samples. The Student's t-test is an example of 
a robust statistical test that can be used to find significant differences between 
two groups. The lower the p-value, the more compelling the evidence that the gene 
is showing a difference between the different groups. Nevertheless, since 
microarrays measure more than one gene at a time, tens of thousands of statistical 
tests may be asked at one time. Because of this, small p-values are likely to be 
seen just by chance, and adjustments for this using a Sidak correction as well 
as a randomization/permutation experiment can be made. A p-value less than 0.05 by 
the t-test is evidence that the gene is significantly different. More compelling 
evidence is a p-value less than 0.05 after the Sidak correction is factored in. For a 
large number of samples in each group, a p-value less than 0.05 after the 
randomization/permutation test is the most compelling evidence of a significant 
difference. 
ABSTRACT: 



The invention comprises novel methods and strategies to detect and/or quantify 
nucleic acid analytes. The methods involve nucleic acid probes with covalently 
conjugated dyes, which are attached either at adjacent nucleotides or at the same 
nucleotide of the probe and novel linker molecules to attach the dyes to the 
probes. The nucleic acid probes generate a fluorescent signal upon hybridization to 
complementary nucleic acids based on the interaction of one of the attached dyes, 
which is either an intercalator or a DNA groove binder, with the formed double 
stranded DNA. The methods can be applied to a variety of applications including 
homogeneous assays, real-time PCR monitoring, transcription assays, expression 
analysis on nucleic acid microarrays and other microarray applications such as 
genotyping (SNP analysis). The methods further include pH-sensitive nucleic acid 
probes that provide switchable fluorescence signals that are triggered by a change 
in the pH of the medium. 
Detail Description Paragraph : 

[0207] In a conventional assay that is based on microarrays with fluorescent 
methods, the array needs to be stringently washed in order to remove non-hybridized 
fluorescent nucleotide sequences that interfere with the detection of the 
fluorescent signal and increase the signal to noise ratio in the assay. In these 
assays a delicate balance must be achieved with respect to the stringency of the 
washing. A stringency that is too high removes correctly hybridized sequences, 
which decreases positive signals in the assay, and a stringency that is too low 
increases false positive signals leading to unreliable and erroneous results. 
Often, unspecific absorption of fluorescent nucleic acids on the surface of 
microarrays is a major problem in these assays as the background fluorescence 
generated from such absorption processes can't be removed effectively. This is 
particularly true for surfaces modified with chemical agents that provide aminated 
surfaces. An aminated surface provides a positively charged matrix that attracts 
all negatively charged nucleic acid species in a sample in a non-specific manner. 

Detail Description Paragraph : 

[0208] The nucleic acid probes described herein require less stringent washes 
because the probes generate positive signals if and only if they are correctly 
hybridized on the array. Under optimized conditions the nucleic acid probes can be 
applied even without washes, because the probes themselves are non-fluorescent and 
do not produce a background. This feature greatly diminishes the cost of microarray 
based assays, increases the signal to noise ratio of the assay and therefore its 
reliability and allows the analysis of smaller and/or more dilute nucleic acid 
target samples. 
ABSTRACT: 



The present invention is directed to deconvolution and normalization of assay data. 
The present invention includes a control and analysis system, used in conjunction 
with a signal generation and detection apparatus, for capturing, processing and 
analyzing images of samples having resonance light scattering (RLS) particle 
labels. The control and analysis system processes instructions and algorithms for 
performing multiplexed assays of two or more colors, for example, to allow 
separation and analysis of detected light that contains information from two or 
more different types or sizes of RLS particles. The multiplexing analysis software 
is preferably incorporated within the system of the present invention, and the 
multiplexing analysis is preferably performed in real-time during a scanning or 
assay procedure. The invention provides for a computer readable medium containing 
instructions for carrying out the same. 
Detail Description Paragraph : 

[0583] As indicated hereinabove, utilization of RLS detection methods can involve 
various techniques of data processing, typically utilized for the purpose of 
providing such enhancements as greater signal to noise ratios, increased 
sensitivity, extended dynamic range, comparability within and/or between assays, 
and to identify particular features. These data processing techniques can be 
incorporated as sets of instructions embedded in software and/or hardware that is 
part of an analyzer and/or embedded in a computer with a data connection to an 
analyzer. Such software is typically recorded in an ordinary computer storage 
medium, including, but not limited to: volatile or non-volatile random access 
memory (RAM), read only memory (ROM), magnetic tape, hard or floppy disks, and 
optical storage media such as compact disks (CD, CD-ROM). Further assay methods, 
preferably array-based methods and especially microarray assay methods, can 
advantageously utilize such software and/or hardware. 
ABSTRACT: 



A method of obtaining a corrected image of a microarray includes acquiring an image 
of a microarray including a target spot, and processing the image to correct for 
background noise and chip misalignment. The method also includes analyzing the 
image to identify a target patch, edit debris, and correct for ratio bias; and 
detecting single copy number variation in the target spot using an objective 
statistical analysis that includes a t-value statistical analysis. The method 
provides statistically robust computational processes for accurately detecting 
genomic variation at the single copy level. 
Abstract Paragraph : 

A method of obtaining a corrected image of a microarray includes acquiring an image 
of a microarray including a target spot, and processing the image to correct for 
background noise and chip misalignment. The method also includes analyzing the 
image to identify a target patch, edit debris, and correct for ratio bias; and 
detecting single copy number variation in the target spot using an objective 
statistical analysis that includes a t-value statistical analysis. The method 
provides statistically robust computational processes for accurately detecting 
genomic variation at the single copy level. 

Summary of Invention Paragraph : 

[0005] According to one aspect of the invention, a method includes acquiring an 
image of a genomic microarray including a target spot; processing the image to 
correct for background noise and chip misalignment; analyzing the image to identify 
a target patch, edit debris, and correct for ratio bias; and, detecting single copy 
number variation in the target spot using an objective statistical analysis that 
includes a t-value statistical analysis. In some embodiments, the target spot is 
Deoxyribonucleic Acid (DNA). One or more of the following features may also be
included. 

Summary of Invention Paragraph : 

[0010] Further, subsequent to acquiring the image of the genomic microarray and 
prior to correcting the background noise, processing the image includes 
automatically detecting misalignment of the genomic microarray and correcting its 
rotation. 
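
The excerpt does not say how misalignment is detected or how rotation is corrected. As a minimal sketch only, assuming the centroids of one nominally horizontal row of spots have already been located, the tilt could be estimated from a least-squares line through those centroids and undone by rotating the coordinates back. The function names and the example row are hypothetical and are not taken from the patent.

```python
import math

def estimate_rotation(row_centroids):
    """Angle (radians) of the least-squares line through the (x, y)
    centroids of one nominally horizontal row of spots."""
    n = len(row_centroids)
    mean_x = sum(x for x, _ in row_centroids) / n
    mean_y = sum(y for _, y in row_centroids) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in row_centroids)
    sxx = sum((x - mean_x) ** 2 for x, _ in row_centroids)
    # slope = sxy / sxx, so atan2(sxy, sxx) is the tilt of the row
    return math.atan2(sxy, sxx)

def rotate_points(points, angle, center=(0.0, 0.0)):
    """Rotate (x, y) points by `angle` radians about `center`."""
    cx, cy = center
    c, s = math.cos(angle), math.sin(angle)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]

# Hypothetical row of ten spot centroids on a chip tilted by 3 degrees.
tilt = math.radians(3.0)
tilted_row = [(10 * i * math.cos(tilt), 10 * i * math.sin(tilt)) for i in range(10)]

angle = estimate_rotation(tilted_row)
leveled_row = rotate_points(tilted_row, -angle)  # undo the detected tilt
```

In practice the same correction would be applied to the image itself before background correction, in the order the paragraph above describes.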

Summary of Invention Paragraph : 

[0026] According to another aspect of the invention, a computer program product
residing on a computer readable medium causes a processor to acquire an image of
a genomic microarray including a target spot; process the image to correct for
background noise and chip misalignment; and analyze the image to identify a DNA
patch, edit debris, and correct a ratio bias. The computer program product
also causes the processor to detect single copy number variation in the target
spot using an objective statistical analysis.

Summary of Invention Paragraph : 

[0029] Subsequent to causing the processor to acquire the image of the genomic
microarray and prior to causing the processor to process the image for correcting
the background noise, the computer program product further causes the processor to
automatically detect misalignment of the genomic microarray and correct rotation
of the genomic microarray. The genomic material includes a range between 50 kbp
and 200 kbp.

Summary of Invention Paragraph : 

[0042] According to another aspect of the invention, a processor and a memory are
provided which are configured to acquire an image of a genomic microarray including
a target spot; to process the image to correct for background noise and chip
misalignment; to analyze the image to identify a DNA patch, edit debris,
and correct a ratio bias; and to detect single copy number variation in the
target spot using an objective statistical analysis.

Summary of Invention Paragraph : 

[0043] According to yet another aspect of the invention, a system includes means
for acquiring an image of a genomic microarray including a target spot; means for
processing the image to correct for background noise and chip misalignment;
means for analyzing the image to identify a DNA patch, edit debris, and
correct a ratio bias; and means for detecting single copy number variation in
the target spot using an objective statistical analysis.

Summary of Invention Paragraph : 

[0052] The methods provide a robust microarray computing system for obtaining data 
with superior quality and experimental results. Specifically, the image acquisition 
process provides higher quality images due to improved noise correction mechanisms 
and tightly integrated hardware and software components. Consequently, higher 
quality images are examined providing the automated tools a better analytical input 
for giving the user more reliable and accurate results. In addition, important
experimental results associated with statistical values are provided with a higher 
degree of confidence. 

Detail Description Paragraph : 

[0068] As described below, methods and systems of obtaining a corrected image of a 
microarray include acquiring an image of a microarray including a target spot, 
processing the image to correct for background noise and chip misalignment, 
analyzing the image to identify a target patch, edit debris, and correct for ratio 
bias, and detecting single copy number variation in the target spot using an 
objective statistical analysis that includes a t-value statistical analysis. 
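
The excerpt names a t-value statistical analysis but does not give its form. As a hedged illustration, assuming each target is printed as several replicate spots and each spot yields a log2(test/reference) ratio, a one-sample t-value against the no-change expectation could be computed as below. The function name and the replicate values are hypothetical.

```python
import math
from statistics import mean, stdev

def t_value(log2_ratios, expected=0.0):
    """One-sample t-value of replicate log2(test/reference) ratios.

    `expected` is the log2 ratio under the null hypothesis of no copy
    number change (0.0 when test and reference carry the same copies).
    """
    n = len(log2_ratios)
    if n < 2:
        raise ValueError("need at least two replicate spots")
    sd = stdev(log2_ratios)  # sample standard deviation (n - 1 denominator)
    if sd == 0.0:
        raise ValueError("replicates are identical; t-value undefined")
    return (mean(log2_ratios) - expected) / (sd / math.sqrt(n))

# Hypothetical replicates: a single-copy gain (3 vs. 2 copies) ideally
# gives log2(3/2), about 0.585; an unchanged target stays near 0.
gain_spots = [0.55, 0.61, 0.58, 0.60]
normal_spots = [0.03, -0.02, 0.01, -0.04]

t_gain = t_value(gain_spots)
t_normal = t_value(normal_spots)
```

A large absolute t-value for a target, judged against the t distribution with n - 1 degrees of freedom, would flag a candidate copy number change; the cutoff is a design choice the excerpt does not specify.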



1. A method of obtaining a corrected image of a microarray, the method comprising: 
acquiring an image of a microarray including a target spot; processing the image to 
correct for background noise and chip misalignment; analyzing the image to identify 
a target patch, edit debris, and correct for ratio bias; and, detecting single copy 
number variation in the target spot using an objective statistical analysis that 
includes a t-value statistical analysis. 

9. The method of claim 1, wherein subsequent to acquiring the image of the genomic 
microarray and prior to correcting for the background noise, processing the image 
includes automatically detecting misalignment of the genomic microarray and 
correcting for rotation of the genomic microarray.

29. A computer program product residing on a computer readable medium having
instructions stored thereon which, when executed by the processor, cause the
processor to: acquire an image of a microarray including a target spot; process the
image to correct for background noise and chip misalignment; analyze the image to
identify the target patch, edit debris, and correct for ratio bias; and, detect
single copy number variation in the target spot using an objective statistical
analysis.

32. The computer program product of claim 29, wherein subsequent to causing the 
processor to acquire the image of the genomic microarray and prior to causing the 
processor to process the image for correcting the background noise, causing the 
processor to process the image further includes automatically causing the processor 
to detect misalignment of the genomic microarray and correct rotation of the 
genomic microarray.

46. A processor and a memory configured to: acquire an image of a microarray 
including a target spot; process the image to correct for background noise and chip 
misalignment; analyze the image to identify a target patch, edit debris, and 
correct for ratio bias; and, detect single copy number variation in the target spot 
using an objective statistical analysis. 

47. A system comprising: means for acquiring an image of a microarray including a 
target spot; means for processing the image to correct for background noise and 
chip misalignment; means for analyzing the image to identify a target patch, edit 
debris, and correct for ratio bias; and, means for detecting single copy number 
variation in the target spot using an objective statistical analysis. 



CLAIMS : 
ABSTRACT:

Methods, systems, and computer program products for analyzing images of high 
density microarray chips analyze the image by estimating background using a 
blurring kernel and/or a spatial multivariate statistical model of the background. 
The methods, systems, and computer program products can employ a multivariate 
statistical model and/or a blurring kernel to obtain more representative 
hybridization intensity results, particularly for pixels in boundary regions of the 
probe cells. The methods allow for alternative microarray configurations of nucleic 
acid probes and do not require the use of mismatch probes and can be independent of 
the type of nucleotide sequence used. Associated microarrays and systems are also 
described. 
Detail Description Paragraph :

[0056] Referring now to FIG. 5, exemplary operations for analyzing an image to
account for background illumination and/or noise are illustrated. As shown, an
image of an expressed microarray is obtained (block 130). In certain embodiments,
to establish the estimated level of background/noise in the image, the extracted
raw image data at the resolution of individual pixels can be obtained and analyzed.
Still referring to FIG. 5, one or more individual probe cell locations in the image
may be positionally estimated in the image as desirable (block 135). The estimates
of the probe cell locations may be provided in any suitable manner such as via
conventional operations or as described in U.S. Pat. Nos. 6,090,555 and 5,631,734,
and co-pending, co-assigned U.S. Provisional Patent Application Serial No.
identified by Attorney Docket No. 5405-261PR; the contents of these documents are
hereby incorporated by reference as if recited in full herein.
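
The abstract of this record mentions estimating background with a blurring kernel, but the excerpt does not define the kernel. As a rough sketch under that caveat, a normalized box kernel is used here as a stand-in: each pixel's background is taken as the local mean of its neighborhood and subtracted from the raw intensity. The function names, the kernel shape, and the toy image are assumptions for illustration.

```python
def box_blur(image, radius):
    """Blur a 2-D list of pixel intensities with a (2r+1) x (2r+1) box
    kernel. Near the edges only pixels inside the image are averaged."""
    rows, cols = len(image), len(image[0])
    blurred = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            blurred[r][c] = total / count
    return blurred

def subtract_background(image, radius):
    """Subtract the blurred (local-mean) background, clipping at zero."""
    background = box_blur(image, radius)
    return [[max(0.0, pixel - bg) for pixel, bg in zip(img_row, bg_row)]
            for img_row, bg_row in zip(image, background)]

# Toy 5 x 5 image: flat background of 10 with one bright pixel of 110.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 110

corrected = subtract_background(image, radius=2)
```

Because the local mean includes the signal itself, this simple version overestimates the background near bright probe cells; a kernel wider than a probe cell, or the spatial statistical model the abstract also mentions, would reduce that bias.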
ABSTRACT:



A novel coupled two-way clustering approach to gene microarray data analysis, for 
identifying subsets of the genes and samples, such that when one of these items is 
used to cluster the other, stable and significant partitions emerge. The method of 
the present invention preferably uses iterative clustering in order to execute this 
search in an efficient way. This approach is especially suitable for gene 
microarray data, where the contributions of a variety of biological mechanisms to 
the gene expression levels are entangled in a large body of experimental data. The 
method of the present invention was applied to two gene microarray data sets, on 
colon cancer and leukemia. By identifying relevant subsets of the data and focusing 
on these subsets, partitions and correlations were found that were masked and 
hidden when the full data set was used in the analysis. 
Detail Description Paragraph :

[0051] The coupled two-way clustering method of the present invention is a general 
way to analyze gene microarray data, and may optionally be used with any suitable 
clustering algorithm, such that the present invention is not limited to any 
particular clustering algorithm. A particularly preferred clustering algorithm, 
which is used in the examples described in greater detail below, is the
super-paramagnetic clustering algorithm (SPC) [9, 10, 11, 12]. This algorithm is
especially suitable for gene microarray data analysis due to its robustness against 
noise and its "natural" ability to identify stable clusters. 

Detail Description Paragraph : 

[0141] The coupled two-way clustering method of the present invention has been 
demonstrated to be computationally feasible for the cases which were studied. One 
of the reasons is that the stable clusters generated by the procedure become small 
with increasing iterations. Therefore their clustering analysis gets faster, and 
the method typically stops after only a few iterations. The method of the present 
invention is applicable with any reasonable, suitable choice of clustering 
algorithm, as long as the selected algorithm is capable of identifying stable 
clusters. The examples of analyses which were described above concerned one
exemplary but preferred clustering algorithm, which is the super-paramagnetic
clustering algorithm (SPC). This algorithm is especially suitable for gene
microarray data analysis due to its robustness against noise, which is inherent in
such experiments.
ABSTRACT :



The present invention provides a solution to problems associated with the use of
hybridization for genetic analysis, including but not limited to the use of
microarray technology for the analysis of DNA. The present invention provides
compositions and methods for the use of reflections of DNA in genetic analysis. The
present invention is also directed to methods for the production of reflections of
DNA.
Detail Description Paragraph :

[0032] The principle of this method is to use the collection of fragments, for 
example fragments arrayed in a microarray, to isolate the complementary fragments
from a sample for analysis. This creates a sample for hybridization that has a 
complexity on the order of the array being used for hybridization. By doing this 
the complexity of the sample can be dropped enormously. This in turn allows for 
better signal to noise for the probes on the array. This attribute allows the 
identification of specific fragments from genomes of size and complexity that could 
not normally be analyzed by conventional methods. The method of the invention can 
be used to analyze genome copy number in samples such as human genomic DNA compared
on cDNA arrays. Reflections of normal and tumor DNA samples are compared to identify
regions of the genome that undergo copy number fluctuation in cancer corresponding 
to the cDNAs or genes on the array. 
ABSTRACT:



A technique is described for determining the effectiveness of medical therapy and 
dosage formulations by analyzing dot spectrograms representative of quantized 
hybridization activity in biological samples, such as DNA, RNA or other protein 
biomolecular array samples, taken at different sampling times from a patient 
undergoing the treatment. The technique directly lends itself to disease 
progression analysis based on markers such as viral load. In accordance with the 
technique, a viral diffusion curve associated with a therapy of interest is 
generated and each dot spectrogram is then mapped to the viral diffusion curve 
using fractal filtering to yield a filtered viral diffusion curve for each sample. 
A degree of convergence between the filtered viral diffusion curves is determined. 
Then, a determination is made as to whether the therapy of interest has been 
effective by determining whether the degree of convergence increases from one 
sample to another, with an increase in the degree of convergence being 
representative of a lack of effectiveness of the therapy of interest. In a specific 
example described herein, the viral diffusion curve is generated by inputting 
parameters representative of viral load studies for the therapy of interest, 
generating a preliminary viral diffusion curve based upon the viral load studies 
and then calibrating a degree of directional causality in the preliminary viral 
diffusion curve to yield the viral diffusion curve. Each dot spectrogram is mapped 
to the viral diffusion curve using fractal filtering by generating a partitioned 
iterated fractal system (IFS) model representative of the dot spectrogram, 
determining affine parameters for the IFS model, and then mapping the dot spectrogram
onto the viral diffusion curve using the IFS. Before the dot spectrogram is mapped
to the viral diffusion curve, the dot spectrogram is interferometrically enhanced.
After the mapping, any uncertainty in the filtered viral diffusion curve is 
compensated for using non-linear information filtering. A method is also described 
for determining the viral load within a biological sample by analyzing a dot
spectrogram generated for the sample in connection with viral diffusion curves 
associated with a therapy of interest. Thus a technique is provided for detecting 
and tracking infections such as viral infections and establishing clinical 
endpoints, based on accurate biomolecular measurements of viral DNA or RNA in 
peripheral blood. The technique also provides a computational protocol for 
leveraging a clinical marker to establish and track therapy effectiveness based on 
quantification of amplified nucleic acid, i.e., DNA and RNA assays. The technique
can potentially amplify dynamic ranges over 100 times or more over conventional
assays. The technique enables point-of-care viral load detection biosensors to
reliably predict the likelihood of disease progression and thereby allows the 
patient to make earlier and more effective decisions about treatment. 
Detail Description Paragraph :

[0190] Specifically, we include the dynamic NIF correction function to the gradient 
of the VDC at the sample point normalized in a manner such that when the 
information uncertainty is null, the correction term vanishes. As discussed in the 
above steps, the NIF correction term is actually derived from the noise statistics
of the microarray sample. 
ABSTRACT :



The invention relates to computer-based systems and methods for the design, 
comparison and analysis of genetic and proteomic databases. In a particular 
embodiment, the recited systems and methods have been implemented in a computer 
tool called ARROGANT. ARROGANT, in the analysis mode, is a comprehensive tool for 
providing annotation to large gene and protein collections. ARROGANT takes in a
large collection of sequence identifiers and associates it with other information 
collected from many sources like sequence annotations, pathways, homology, 
polymorphisms, artifacts, etc. The simultaneous annotation for a large assembly of 
genes makes the collection of genomic/EST sequences truly informative. 
Summary of Invention Paragraph :

[0008] The production of DNA microarrays can be divided into four stages: a.
Selection of array elements and design of the probe DNA; b. Preparation of the 
probe DNA; c. Preparation of a suitable design substrate to spot the probes on; d. 
Deposition of array elements. The selection of array elements for microarrays 
involves assembling a large gene collection. It would be very valuable if the same 
tool (to compile a large gene collection) could be used to further design primers, 
look for commercially available clones (expression microarrays) and design
resequencing probes (resequencing microarrays). Once the genes are spotted on the
microarray and hybridized to fluorescently labeled probes, there are a number of
software programs that help in conversion of the fluorescence of the scanned image
to numbers, using complex mathematical corrections to extract signal from
background noise, e.g. Genepix (http://www.axon.com/GN_GenePixSoftware.html) and
ArrayVision (http://imaging.brocku.ca/products/Arrayvision.htm). These numbers
indicate level of expression. Other programs such as GeneSpring (Silva et al, HMS 
Beagle: The BioMedNet Magazine Issue 82, 2000), Cluster Treeview (Eisen M B et al, 
Proc Natl Acad Sci USA 95) and Spotfire (http://www.spotfire.com), help in the 
analysis by clustering the data together using various methods based on K-means,
hierarchical clustering, or self-organizing maps. Clustering algorithms use the expression level
data to group the various elements on the array. It would also be very useful to 
view the elements of the array with their complete annotation and overlay the 
expression level data on top of it. The data could further be selectively viewed by 
sorting on various annotation fields and the expression level data. This approach 
could be useful to view any large gene collection in general. With the increasing 
number of microarray experiments, it would be valuable to compare elements between 
different microarrays considering that fragments of the same gene might be 
represented by different sequence identifiers. For example, two different accession
numbers might belong to the same UniGene cluster, representing the same gene. An 
artifact sometimes observed in the results obtained from an expression profiling 
microarray experiment is that some sequences might hybridize to other sequences to 
which they are significantly similar. This leads to false positive results after a 
microarray experiment. Although Human Cot DNA is often used to prevent non-specific 
hybridization by blocking simple repetitive elements in genomic DNA, as shown in 
experiments to study cross-hybridization, Human Cot DNA is not very effective in 
preventing cross hybridization. ARROGANT computationally estimates the amount of 
cross hybridization for each sequence and tags potential genes as possible 
candidates for cross hybridization. 
PGPUB-DOCUMENT-NUMBER: 20030013109
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20030013109 A1

TITLE: Hairpin sensors using quenchable fluorescing agents
PUBLICATION-DATE: January 16, 2003



INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Ballinger, Clinton T. Burnt Hills NY US
LoCascio, Michael Albany NY US
Landry, Daniel P. Clifton Park NY US



US-CL-CURRENT: 435/6; 435/287.2 



ABSTRACT : 



The present invention provides for a device and method for detecting genetic 
material. The device includes at least one hairpin sensor or, preferably two or 
more hairpin sensors, spatially and/or spectrally multiplexed on a conductive or 
semi-conductive substrate or particle. The at least one hairpin sensor includes a 
quenchable fluorescing agent bound to a hairpin loop assembly and the hairpin loop 
assembly includes a probe complementary to a nucleotide sequence of interest. The 
method includes providing at least one hairpin sensor, exposing the at least one 
hairpin sensor to a sample of interest, and detecting fluorescence produced by the 
quenchable fluorescing agent. The fluorescence indicates the binding of a target 
nucleotide sequence to the complementary probe of the hairpin loop assembly. 
TITLE: Hairpin sensors using quenchable fluorescing agents
Summary of Invention Paragraph :

[0008] The biochip industry has recently started using miniaturization and 
integration, similar to computer chip manufacturers, to develop entire assay 
systems on a single support. These microarrays or "labs on a chip" have been used 
to revolutionize genomics, drug development, clinical diagnostics and environmental 
monitoring in much the same way microprocessors revolutionized the computer 
industry. These microarrays give higher throughput, lower cost, portability and 
automation than the traditional bio-chemical assay methods. Because these biochips 
often have used traditional fluorescing dyes, however, they have allowed for only 
limited spectral multiplexing because of the high background noise and broad 
emission spectra, for example, of fluorescing dyes. Further, prior microarrays have 
provided limited spatial multiplexing as the number of molecules that can be 
identified in a single assay is limited by the number of differentiable locations
on the array. 
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20020169562 A1

TITLE: Defining biological states and related genes, proteins and patterns
PUBLICATION-DATE: November 14, 2002
INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Stephanopoulos, Gregory Chester MA US
Misra, Jatin Cambridge MA US
Hwang, Daehee Cambridge MA US
Schmitt, William A. JR. Boston MA US
Silva, Saliya Sudharshana Kandy CO LK

US-CL-CURRENT: 702/19; 435/6, 530/350, 536/23.1



ABSTRACT : 

Disclosed are a variety of methods and computer systems for use in the analysis of 
gene and protein expression data. Also disclosed are methods for the definition of 
the cellular state of cells and tissues from multidimensional physiological data 
such as those obtained from gene expression measurements with DNA microarrays. A 
variety of classification methods can be applied to expression data to achieve this 
goal. Demonstrated is the application of several statistical tools including Wilks'
lambda ratio of within-group to total variance, Fisher Discriminant Analysis, and
the misclassification error rate to the identification of discriminating genes and 
the overall classification of expression data. Examples from several different 
cases demonstrate the ability of the method to produce well-separated groups in the 
projection space representing distinct physiological states. The method can be 
augmented and is useful in disease diagnosis, drug screening and bioprocessing 
applications. 
TITLE: Defining biological states and related genes, proteins and patterns
Detail Description Paragraph : 

[0301] We applied FDA projections to four examples of gene expression phenotypes
generated in our laboratory and also published in the literature. In the first
example, cultures of the photosynthetic bacterium Synechocystis sp. PCC 6803 were
cultivated through an initial period of 48 hours of growth under light followed by
24 hours of darkness. The cultures were then cycled between light and dark
conditions for 100 minutes each (FIG. 4). The expression levels of 88 genes
associated with harvesting of light energy and central carbon metabolism were
measured at 23 time points (29 total samples, including duplicates) using DNA
microarrays. Gill, R. T., E. Katsoulakis, W. Schmitt, G. Taroncher-Oldenburg and G.
Stephanopoulos, "Dynamic transcriptional profiling of the light to dark transition
in Synechocystis sp. PCC6803," (submitted) (2000). Total signal to noise ratio of
the microarray fluorescence was determined to be ca. 4.0, indicating that
background noise minimally interfered with the fluorescence of hybridized spots.
Reproducibility of expression measurements, evaluated from microarray-to-microarray
measurements, as well as from intra-microarray triplicate spots, was 45%, suggesting
that a 90% difference in fluorescence is reproducible within a 95% confidence level.
Of the 88 total genes considered, 27 discriminatory genes were identified based on
their Wilks' lambda measure with a stringent 99% confidence level. Dillon, W.R.,
and M. Goldstein. Multivariate Analysis, John Wiley & Sons. (1984). FIG. 4 shows the
projection of the expression phenotype of the 27 Synechocystis discriminatory genes
to the FDA-defined 3-D space. Three dimensions were used in this projection to
distinguish the four phenotypic classes of growth under the light and dark
conditions shown in FIG. 4. The class separation can also be seen in 2-D diagrams
of the above canonical variables (FIG. 4c). CV1 distinguishes group 2 from the
other groups while CV2 separates groups 1 and 3. Hence, the second CV loadings
provide information on the identity of the genes supporting the differences in the
cellular processes occurring under light and dark conditions.
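
The abstract of this record describes Wilks' lambda as the ratio of within-group to total variance. For a single gene this reduces to a ratio of sums of squares, sketched below; values near 0 indicate a gene that separates the classes well, and values near 1 indicate no separation. The function name and the expression values are hypothetical, and the confidence-level cutoff used in the text is not reproduced here.

```python
def wilks_lambda(groups):
    """Univariate Wilks' lambda for one gene.

    `groups` holds one list of expression values per class. The result is
    the within-group sum of squares divided by the total sum of squares.
    """
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0.0:
        raise ValueError("gene has no variance; lambda undefined")
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((v - group_mean) ** 2 for v in group)
    return ss_within / ss_total

# Hypothetical expression of two genes in two classes (e.g. light, dark).
discriminating = [[1.0, 1.2, 0.8], [3.0, 3.2, 2.8]]
uninformative = [[1.0, 3.0, 2.0], [2.0, 1.0, 3.0]]

lambda_good = wilks_lambda(discriminating)  # small: classes well separated
lambda_flat = wilks_lambda(uninformative)   # 1.0: class means are identical
```

Ranking genes by ascending lambda and keeping those below a chosen significance cutoff is one way to arrive at a discriminatory subset like the 27 genes mentioned above.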
PGPUB-DOCUMENT-NUMBER: 20020155480
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20020155480 A1

TITLE: Brain tumor diagnosis and outcome prediction 

PUBLICATION-DATE: October 24, 2002 



INVENTOR-INFORMATION:

NAME CITY STATE COUNTRY RULE-47

Golub, Todd R. Newton MA US
Lander, Eric S. Cambridge MA US
Pomeroy, Scott Newton MA US
Tamayo, Pablo Cambridge MA US

US-CL-CURRENT: 435/6



ABSTRACT: 



Methods for predicting phenotypic classes of brain tumors, such as brain tumor type 
or treatment outcome, for brain tumor samples based on gene expression profiles are 
described. 
Detail Description Paragraph :

[0069] Data Analysis: Supervised Learning. Genes correlated with particular class
distinctions (e.g., classic vs. desmoplastic medulloblastoma) were identified by
sorting all of the genes on the array according to the signal-to-noise statistic
$(\mu_0 - \mu_1)/(\sigma_0 + \sigma_1)$, where $\mu$ and $\sigma$
represent the median and standard deviation of expression, respectively, for each
class. Similar results were obtained using a standard t-statistic as the metric,
$(\mu_0 - \mu_1)/\sqrt{\sigma_0^2/N_0 + \sigma_1^2/N_1}$,
where $N$ represents the number of samples in each class (see
Supplementary Information; http://www.genome.wi.mit.edu/MPR/CNS). Permutation of
the column (sample) labels was performed to compare these correlations to what
would be expected by chance in 99% of the permutations. For classification, a
modification of the k-NN algorithm was developed that predicts the class of a new 
data point by calculating the Euclidean distance (d) of the new sample to the k 
nearest samples (for these experiments, k=5) in the training set using normalized 
gene expression data, and selecting the class to be that of the majority of the k 
samples. The weight given to each neighbor was 1/d. The k-NN models were evaluated 
by 60-fold leave-one-out cross-validation whereby a training set of 59 samples was 
used to predict the class of a randomly withheld sample, and the cumulative error 
rate was recorded. Models with variable numbers of genes (1-200, selected according 
to their correlation with the survivor vs. treatment failure distinction in the 
training set) were tested in this manner. An 8-gene k-NN outcome prediction model 
yielded the lowest error rate, and was therefore used to generate Kaplan-Meier 
survival plots using S-Plus. Predictors using metastatic staging or TrkC were 
constructed by finding the decision boundary half way between the classes,
$(\mu_{\mathrm{class0}} + \mu_{\mathrm{class1}})/2$, using either the staging values 0 vs. 1, 2, 3,
4 or the continuous TrkC microarray gene expression levels, and then predicting the 
unknown sample according to its location with respect to that boundary. 
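
The gene-ranking statistics and the distance-weighted k-NN classifier described in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy data are hypothetical, normalization and gene selection are omitted, and the center is taken as the median because the text defines it that way.

```python
import math
from statistics import median, stdev

def signal_to_noise(class0, class1):
    """(mu0 - mu1) / (sigma0 + sigma1) for one gene."""
    return (median(class0) - median(class1)) / (stdev(class0) + stdev(class1))

def t_statistic(class0, class1):
    """(mu0 - mu1) / sqrt(sigma0^2/N0 + sigma1^2/N1) for one gene."""
    spread = stdev(class0) ** 2 / len(class0) + stdev(class1) ** 2 / len(class1)
    return (median(class0) - median(class1)) / math.sqrt(spread)

def knn_predict(train, labels, sample, k=5):
    """Classify `sample` by a vote of its k nearest training samples,
    each neighbor weighted by 1/d (Euclidean distance d)."""
    neighbors = sorted(((math.dist(x, sample), y) for x, y in zip(train, labels)),
                       key=lambda pair: pair[0])[:k]
    votes = {}
    for d, label in neighbors:
        weight = 1.0 / d if d > 0 else float("inf")  # exact match dominates
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)

def leave_one_out_error(samples, labels, k=5):
    """Fraction of samples misclassified when each is withheld in turn."""
    errors = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if knn_predict(train, train_labels, sample, k) != label:
            errors += 1
    return errors / len(samples)

# Toy data: one gene in two classes, then eight 2-gene expression profiles.
s2n = signal_to_noise([1, 2, 3], [4, 5, 6])
t = t_statistic([1, 2, 3], [4, 5, 6])

samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
           (2.0, 2.1), (2.2, 1.9), (1.9, 2.3), (2.1, 2.0)]
labels = ["survivor"] * 4 + ["failure"] * 4
error_rate = leave_one_out_error(samples, labels, k=3)
```

With 60 samples, the same loop withholds one sample and trains on the remaining 59 each time, which matches the cross-validation scheme the paragraph describes.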
ABSTRACT:

The system can be used for the automatic analysis of images (I), comprising a
matrix of spots, such as images of DNA microarrays after hybridisation. The system 
can be associated — and preferably integrated in a single monolithic component 
implementing VLSI CMOS technology — to a sensor (10) for acquiring said images (I) . 
The system comprises a circuit (20) for processing the signals corresponding to the 
images (I), configured according to a cellular neural network (CNN) architecture 
for the parallel analogue processing of signals. 
TITLE: System for the automatic analysis of images such as DNA microarray images



Brief Description of Drawings Paragraph : 

[0047] FIG. 8 (which is split into two parts, identified by 8a and 8b, 
respectively) and figures from 9 to 12 illustrate, for example, the method 
according to which the various operations concerning filtering, segmenting, and the 
morphological operations, which can be implemented in a system according to this 
invention, can be conducted to isolate useful information with respect to the 
various sources of noise which could lead to false interpretations of the results 
during the automatic microarray image analysis process. 
US
US-CL-CURRENT: 435/6; 435/32, 435/4



ABSTRACT : 



The invention features methods of high throughput screening of candidate drug 
agents and rapid identification of drug targets by examining induction of the 
stress response in a host cell, e.g., the stress response in wildtype host cells 
and/or in host cells that differ in target gene product dosage (e.g., host cells 
that have two copies of a drug target gene product-encoding sequence relative to 
one copy). In general, induction of the stress response in wildtype host cells
indicates that a candidate agent has activity of the drug. Induction of a 
relatively lower or undetectable stress response in a host cell comprising an 
alteration in gene product dosage indicates that the host cell is drug-sensitive
and is altered in a gene product that plays a role in resistance to the drug. 
TITLE: Identification of drugs and drug targets by detection of the stress response
Summary of Invention Paragraph : 

[0019] Another advantage of the invention is that the methods of the invention 
exploit the stress response, which is normally regarded as unavoidable "background
noise," to rapidly identify candidate agents having drug activity and, where
desired, identify the target gene products of the drug activity. Importantly, this 
stress response often clouds interpretation of microarray expression experiments, 
forcing the sensitivity to be lowered. The present invention exploits the robust 
stress response of cells to provide a sensitive metric of drug effectiveness.
ABSTRACT :



The present invention relates to methods for detecting differential expression of 
embryonic gene products known to play a fundamental role in the embryonic 
developmental process using nucleic acid arrays containing Xenopus embryonic gene 
sequences as set forth in Appendix 1. This allows the detection of the expression 
of differentially expressed genes in embryonic cells, for diagnosing developmental 
disorders or identifying different types of embryonic cells.
Detail Description Paragraph :

[0131] Because the clones selected for RT-PCR analysis in these experiments were
detected at much lower levels than the genes
analyzed in Example 4, above, and therefore have a lower signal intensity on
microarrays, background noise is a greater factor in analyzing the data. Thus,
analysis of total cellular mRNA using the microarrays of this invention is useful
for identifying candidate genes whose expression is spatially restricted in early 
vertebrate embryos. Such candidate genes can then be confirmed, e.g., using more 
sensitive methods such as the RT-PCR techniques described here, by hybridizing 
polyA.sup.+ RNA samples from cells to microarrays, or by using microarrays with 
more specific and sensitive probes for these candidate genes.
ABSTRACT :



The disclosed scanning structure includes an apparatus for light delivery and light 
receiving from a light-excitable area on a substrate to be measured by the scanning 
structure. The light delivery and receiving apparatus may include an optical fiber 
having a proximal end and a distal end which transmits light having a certain 
wavelength or light with several varying wavelengths to excite the substrate 
samples. This optical fiber may also simultaneously receive light which may be 
emitted by fluorescing samples on the substrate. The scanning structure also may 
further include a holder for the optical fiber that is able to traverse variable
distances over the substrate to be measured or examined. Holders may include 
galvano scanners as well as resonating suspension beams. A light source, e.g., a 
laser, may be optically coupled to the optic fiber's proximal end. And this light 
source may be of a certain wavelength, but multiple light sources each having a 
different wavelength may also be used simultaneously by coupling the light sources 
into either a single optic fiber through wavelength multiplexers or by placing 
individual optic fibers carrying differing wavelengths in close proximity to each 
other. As the light is transmitted down to the substrate through the optic fiber, 
the fiber is sufficiently close to the substrate microarray so that it can also 
receive the emitted fluorescing light.
DOCUMENT-IDENTIFIER: US 20020037149 A1
TITLE: Fiber optic scanner 



Summary of Invention Paragraph : 

[0007] In the CCD based system, the imager and lens are both the main cost drivers. 
Here, the excitation light is expanded to a large area causing a great reduction in 
energy density. The exposure time has to be extended several tens of seconds to 
compensate for this reduction. Because of the long exposure time, the CCD imager 
has to be cooled to maintain a reasonable signal to noise ratio. Such a cooled,
large format CCD imager is very expensive at present. In addition, the optical lens 
in the system has to be corrected for chromatic aberrations and image distortions 
over a large field of view, which significantly increases its cost in comparison to 
the lens in the point-to-point system in a scanning microscope. Furthermore, there 
is currently no CCD imager with sufficient resolution and format to cover the 
entire area of a slide (25 mm × 75 mm). Mechanical scanning is still required
for CCD based systems, which reduces the speed of the reader. At present, most
microarray readers on the market require 5 minutes or more to complete the scanning
of a 25 mm × 70 mm slide, which is too slow for many applications.
ABSTRACT :

Microarrays can measure the expression of thousands of genes and thus identify 
changes in expression between different biological states. Methods are needed to 
determine the significance of these changes, while accounting for the enormous 
number of genes. We describe a new method, Significance Analysis of Microarrays 
(SAM), that assigns a score to each gene based on the change in gene expression
relative to the standard deviation of repeated measurements. For genes with scores
greater than an adjustable threshold, SAM uses permutations of the repeated
measurements to estimate the percentage of such genes identified by chance, the
false discovery rate (FDR). When the transcriptional response of human cells to
ionizing radiation was measured by microarrays, SAM identified 34 genes that
changed at least 1.5-fold with an estimated FDR of 12%, compared to FDRs of 60% and 
84% using conventional methods of analysis. Of the 34 genes, 19 were involved in 
cell cycle regulation, and 3 in apoptosis. Surprisingly, 4 nucleotide excision 
repair genes were induced, suggesting that this repair pathway for UV-damaged DNA 
might play a heretofore unrecognized role in repairing DNA damaged by ionizing 
radiation.
Detail Description Paragraph :

[0062] As noted above, factors inherent in the process of acquisition of microarray 
data itself may introduce noise that renders it difficult to discover the 
significance of differences in gene expression or other biological behavior or 
falsely identify genes to be of statistical significance. To overcome such problem, 
a number of methods are described above which allow full utilization of the 
microarray data. One difficulty in making use of the microarray data is due to the 
fact that the expression levels of the genes have a wide range of values or 
scattered values. It is, therefore, desirable to adjust the parameter d(i) so that
it is essentially independent of the wide variation of the values of the parameter 
d(i) and/or of the scatter value s(i). After the parameter has been so adjusted, 
then all of the data can be fully utilized. 
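
The score d(i), the scatter s(i), and the permutation-based FDR estimate described in the abstract and paragraph above can be illustrated with a minimal sketch. This is not the patented procedure itself. The pooled-variance form of s(i), the small constant `s0` added to the denominator to damp the dependence of d(i) on s(i), the fixed threshold, and the function names are all assumptions made for illustration.

```python
import numpy as np

def sam_scores(x, y, s0=0.0):
    """Relative difference d(i) for each gene (hypothetical helper).

    x, y: arrays of shape (genes, replicates) for the two biological states.
    s0:   assumed small positive constant that keeps d(i) from exploding
          when the scatter s(i) is close to zero.
    """
    n1, n2 = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    # Pooled sum of squared deviations over both states.
    ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) \
       + ((y - y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    s = np.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))
    return diff / (s + s0)

def estimate_fdr(x, y, threshold, s0=0.0, n_perm=200, seed=0):
    """Simplified FDR estimate: median count of |d(i)| > threshold under
    random relabelling of the replicates, divided by the observed count."""
    rng = np.random.default_rng(seed)
    data = np.hstack([x, y])
    n1 = x.shape[1]
    n_called = int((np.abs(sam_scores(x, y, s0)) > threshold).sum())
    if n_called == 0:
        return 0.0, 0
    false_counts = []
    for _ in range(n_perm):
        perm = rng.permutation(data.shape[1])
        px, py = data[:, perm[:n1]], data[:, perm[n1:]]
        false_counts.append((np.abs(sam_scores(px, py, s0)) > threshold).sum())
    fdr = min(1.0, float(np.median(false_counts)) / n_called)
    return fdr, n_called
```

Raising `threshold` calls fewer genes and usually lowers the estimated FDR, which is the trade-off the adjustable threshold in the abstract controls.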
INVENTOR-INFORMATION:
US-CL-CURRENT: 435/6; 356/319, 356/326
ABSTRACT: 

The present invention provides a hyperspectral imaging apparatus and methods for 
employing such an apparatus for multi-dye/base detection of a nucleic acid molecule 
coupled to a solid surface.
DOCUMENT-IDENTIFIER: US 20020009744 A1

TITLE: In-line complete spectral fluorescent imaging of nucleic acid molecules 



Detail Description Paragraph : 

[0046] The present invention offers a significant advantage over traditional
gel-based mutation detection methods in the areas of through-put, cost, reliability and
operational simplicity. The present invention can accurately identify heterozygous
mutations. Preferably, analysis of both strands is performed to reduce potential
mis-calling. Under another preferred embodiment, genotyping of wild type DNA (as a
reference strand) for comparison of background noise levels is performed to improve
the ability to accurately identify "true" heterozygotes. The present invention can be
used to detect the incorporation of labeled dye terminators either in solution, or
on microarrays.
US-CL-CURRENT: 702/27; 422/82.05, 435/6, 435/91.1, 435/91.2, 712/200
ABSTRACT: 

A technique is described for quantitating biological indicators, such as viral 
load, using interferometric interactions such as quantum resonance interferometry.
A biological sample is applied to an array information structure that has a
plurality of elements that emit data indicative of viral load. A digitized output
pattern of the arrayed information structure is interferometrically enhanced by
generating interference between the output pattern and a reference wave. The 
interferometrically enhanced output pattern is then analyzed to identify emitted 
data indicative of viral load which in turn is used to determine viral load. 



21 Claims, 5 Drawing figures
TITLE: Technique for quantiating biological markers using quantum resonance
interferometry

Detailed Description Text (167): 

Specifically, we include the dynamic NIF correction function to the gradient of the 
VDC at the sample point normalized in a manner such that when the information 
uncertainty is null, the correction term vanishes. As discussed in the above steps, 
the NIF correction term is actually derived from the noise statistics of the
microarray sample.
ABSTRACT :

A technique is disclosed that is useful for determining the presence of specific 
hybridization expression within an output pattern generated from a digitized image 
of a biological sample applied to an arrayed platform. The output pattern includes 
signals associated with noise, and signals associated with the biological sample, 
some of which are degraded or obscured by noise. Signal processing, such as 
interferometry, or more specifically, resonance interferometry, and even more
specifically quantum resonance interferometry or stochastic resonance 
interferometry, is used to amplify signals associated with the biological sample 
having an intensity lower than the intensity of signals associated with noise so 
that they may be clearly distinguished from background noise. The improved 
detection technique allows rapid, reliable, and inexpensive measurements of arrayed 
platform output patterns.
Exemplary Claim Number: 1
Number of Drawing Sheets: 5
TITLE: Method and system for signal detection in arrayed instrumentation based on
quantum resonance interferometry
Brief Summary Text (9) : 

Accordingly, it would be highly desirable to provide an improved method and apparatus
for analyzing the output of the DNA microarray to more expediently, reliably, and
inexpensively determine the presence of any medical conditions or concerns within
the patient providing the DNA sample. It is particularly desirable to provide a
technique that can identify mutation signatures within dot spectrograms even in
circumstances wherein the signal to noise ratio is extremely low. It is to these
ends that aspects of the invention are generally drawn.



□ 40. Document ID: US 6633659 B1

L3: Entry 40 of 53 File: USPT Oct 14, 2003 

US-PAT-NO: 6633659 

DOCUMENT-IDENTIFIER: US 6633659 B1

TITLE: System and method for automatically analyzing gene expression spots in a 
microarray 

DATE-ISSUED: October 14, 2003 
INVENTOR- INFORMATION: 

NAME CITY STATE ZIP CODE COUNTRY 

Zhou; Yi-Xiong Los Angeles CA 

US-CL-CURRENT: 382/129; 435/6, 702/21



ABSTRACT: 

A digital image processing-based system and method are disclosed for quantitatively 
assessing nucleic acid species expressed in a microarray. The microarray is a grid 
of a plurality of sub-grids of the nucleic acid species. The system includes a 
scanner that has a digital scanning sensor that scans the microarray and transmits
from an output a digital image of the microarray, and a computer that receives the 
digital image of the microarray from the scanner and then processes the digital 
image, detecting an expression signal of the nucleic acid species, segmenting the 
expression signal, calculating a measure of the segmented expression signal, and 
providing the measure at the output of the computer. Prior to segmenting the 
expression signal for a nucleic acid species, the expression signal is 
characterized by a center pixel in the digital image and an approximate radius 
around the center pixel. The computer segments the expression signal by (a) 
tentatively classifying pixels within the approximate radius as signal pixels and 
those outside the approximate radius as background pixels, (b) determining major 
intensity modes for the signal pixels and for the background pixels, and (c) using 
the major intensity modes, reclassifying the signal and background pixels depending 
on each pixel's intensity relative to the major intensity modes.
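
The three segmentation steps (a) to (c) in the abstract above can be sketched as follows. The abstract does not say how a "major intensity mode" is computed or which reclassification rule is used. The histogram-bin mode and the nearest-mode rule below are therefore assumptions, and the names `major_mode` and `segment_spot` are hypothetical.

```python
import numpy as np

def major_mode(values, bins=32):
    """Assumed definition: centre of the most populated histogram bin."""
    counts, edges = np.histogram(values, bins=bins)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])

def segment_spot(patch, center, radius, bins=32):
    """Segment one spot in a 2-D image patch.

    (a) tentatively label pixels inside `radius` of `center` as signal;
    (b) find the major intensity mode of each tentative class;
    (c) relabel every pixel by whichever mode its intensity is closer to
        (an assumed rule).
    """
    rows, cols = np.indices(patch.shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    tentative = dist <= radius
    signal_mode = major_mode(patch[tentative], bins)
    background_mode = major_mode(patch[~tentative], bins)
    signal = np.abs(patch - signal_mode) < np.abs(patch - background_mode)
    return signal, signal_mode, background_mode
```

Because the final labels depend only on intensity, a spot that is larger or more irregular than the approximate radius can still be recovered, provided the tentative classes are each dominated by the right kind of pixel.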

62 Claims, 20 Drawing figures 
Exemplary Claim Number: 1
Number of Drawing Sheets: 14

L3: Entry 40 of 53 File: USPT Oct 14, 2003

DOCUMENT-IDENTIFIER: US 6633659 B1

TITLE: System and method for automatically analyzing gene expression spots in a
microarray



Detailed Description Text (20) : 

After the spatial region for each sub-grid has been partitioned, the automatic
sub-grid detection process 501 proceeds to identify, as shown in FIG. 7, the rows and
columns for each sub-grid of step 702. Again, the steps to find the rows in a
sub-grid are preferably essentially the same as the steps for finding the columns.
Thus, only the process for identifying columns in a sub-grid is outlined below. 
FIG. 9 depicts steps for a preferred row/column detection process 702. In the first 
step 900, all of the pixels in a sub-grid region along the vertical dimension are 
summed to form a one-dimensional horizontal vector. Next, an "averaging" or low 
pass filter whose width is equal to the expected diameter of the spots is applied 
to the vector in step 902. This averaging step 902 is performed because the image 
of each sub-grid region is smaller than the overall microarray image that was 
processed in the previous sub-grid region-locating step 700. By applying the 
averaging filter, the noise that is inherent in a typical microarray image is 
reduced. Next, the maxima or peaks in the horizontal vector are determined in step 
904, again using a maximum filter in which the size of the maximum filter is 
preferably equal to the expected spot size. In the next step 906, using the 
previously calculated mode distance M to establish additional peak locations, peaks 
are added to the vector to fill the length of the sub-grid region. The resulting 
peak locations specify the locations of the columns in the sub-grid region. The 
previous steps for detecting the columns in a sub-grid region are repeated 908 to 
determine the locations of the rows in the sub-grid or vice versa. Finally, a check 
step 910 is performed to determine whether the number of peaks for each vector is 
at least as high as expected. For the horizontal vectors, the number of peaks 
should equal or be greater than the expected number of columns in a sub-grid. For 
vertical vectors, the number of peaks should be equal to or greater than the number 
of rows in a sub-grid. If the number of peaks is less than expected for a 
horizontal or vertical vector, then the process exits at step 912, having not 
performed successfully. If the number of peaks for a given vector is equal to or 
greater than the expected number, then the process exits at step 914 with the row 
and column detection process 702 being considered successful. With a successful 
completion of this process, the rows and columns define candidate sub-grids with 
grid-point intersections in each sub-grid region of the microarray.



US-PAT-NO: 6495363 

DOCUMENT-IDENTIFIER: US 6495363 B2

TITLE: In-line complete spectral fluorescent imaging of nucleic acid molecules
DATE-ISSUED: December 17, 2002
INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY 

Bogdanov; Valery Baltimore MD 

US-CL-CURRENT: 435/287.2; 435/6, 435/7.1, 435/91.2, 536/22.1, 536/23.1, 536/24.3,
536/24.31, 536/24.32, 536/24.33



ABSTRACT: 

The present invention provides a hyperspectral imaging apparatus and methods for 
employing such an apparatus for multi-dye/base detection of a nucleic acid molecule 
coupled to a solid surface. 

10 Claims, 8 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 7 

L3: Entry 41 of 53 File: USPT Dec 17, 2002 



DOCUMENT-IDENTIFIER: US 6495363 B2 

TITLE: In-line complete spectral fluorescent imaging of nucleic acid molecules 



Detailed Description Text (4) : 

The present invention offers a significant advantage over traditional gel-based 
mutation detection methods in the areas of through-put, cost, reliability and 
operational simplicity. The present invention can accurately identify heterozygous 
mutations. Preferably, analysis of both strands is performed to reduce potential 
mis-calling. Under another preferred embodiment, genotyping of wild type DNA (as a 
reference strand) for comparison of background noise levels is performed to improve 
the ability to accurately identify "true" heterozygotes. The present invention can be
used to detect the incorporation of labeled dye terminators either in solution, or
on microarrays.






□ 42. Document ID: US 6245511 B1

L3: Entry 42 of 53 File: USPT Jun 12, 2001

US-PAT-NO: 6245511

DOCUMENT-IDENTIFIER: US 6245511 B1

** See image for Certificate of Correction ** 



TITLE: Method and apparatus for exponentially convergent therapy effectiveness 
monitoring using DNA microarray based viral load measurements 

DATE-ISSUED: June 12, 2001 



INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY

Gulati; Sandeep La Canada CA 91011

US-CL-CURRENT: 435/6; 435/91.2, 536/24.3



ABSTRACT : 



A technique is described for determining the effectiveness of medical therapy and 
dosage formulations by analyzing dot spectrograms representative of quantized 
hybridization activity in biological samples, such as DNA, RNA, or other protein 
biomolecular array samples, taken at different times from a patient undergoing the 
medical therapy. This technique enables disease progression analysis based on 
surrogate markers such as viral load. In accordance with the technique, a viral 
diffusion curve associated with a therapy of interest is generated and each dot 
spectrogram is then mapped to a viral diffusion curve using fractal filtering. 
Next, degree of convergence towards the peak of VDC, between the sample points on a 
filtered viral diffusion curve is determined. The technique allows for
point-of-care viral load detection biosensors to accurately and reliably predict the
likelihood of disease progression. 



46 Claims, 5 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 5 

L3: Entry 42 of 53 File: USPT Jun 12, 2001

DOCUMENT-IDENTIFIER: US 6245511 B1

** See image for Certificate of Correction ** 

TITLE: Method and apparatus for exponentially convergent therapy effectiveness 
monitoring using DNA microarray based viral load measurements 

Detailed Description Text (173) : 

Specifically, we include the dynamic NIF correction function to the gradient of the 
VDC at the sample point normalized in a manner such that when the information 
uncertainty is null, the correction term vanishes. As discussed in the above steps, 
the NIF correction term is actually derived from the noise statistics of the
microarray sample.



□ 43. Document ID: US 6245507 B1

L3: Entry 43 of 53 File: USPT Jun 12, 2001

US-PAT-NO: 6245507

DOCUMENT-IDENTIFIER: US 6245507 B1

TITLE: In-line complete hyperspectral fluorescent imaging of nucleic acid molecules 
DATE-ISSUED: June 12, 2001 



INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY

Bogdanov; Valery Baltimore MD

US-CL-CURRENT: 435/6; 204/450, 204/461, 356/300, 356/301, 356/302, 356/303,
356/344, 435/91.1, 435/91.2, 536/24.3, 536/24.31, 536/24.33



ABSTRACT: 

The present invention provides a hyperspectral imaging apparatus and methods for 
employing such an apparatus for multi-dye/base detection of a nucleic acid molecule 
coupled to a solid surface. 

14 Claims, 8 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 7 

L3: Entry 43 of 53 File: USPT Jun 12, 2001 



DOCUMENT-IDENTIFIER: US 6245507 B1

TITLE: In-line complete hyperspectral fluorescent imaging of nucleic acid molecules 



Detailed Description Text (4) : 

The present invention offers a significant advantage over traditional gel-based 
mutation detection methods in the areas of through-put, cost, reliability and 
operational simplicity. The present invention can accurately identify heterozygous 
mutations. Preferably, analysis of both strands is performed to reduce potential 
mis-calling. Under another preferred embodiment, genotyping of wild type DNA (as a 
reference strand) for comparison of background noise levels is performed to improve 
the ability to accurately identify "true" heterozygotes. The present invention can be
used to detect the incorporation of labeled dye terminators either in solution, or
on microarrays.



□ 44. Document ID: US 6142681 A 

L3: Entry 44 of 53 File: USPT Nov 7, 2000

US-PAT-NO: 6142681

DOCUMENT-IDENTIFIER: US 6142681 A 

TITLE: Method and apparatus for interpreting hybridized bioelectronic DNA 
microarray patterns using self-scaling convergent reverberant dynamics 

DATE-ISSUED: November 7, 2000 



INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY

Gulati; Sandeep La Canada CA

US-CL-CURRENT: 702/19; 382/129, 435/287.2, 435/288.7, 435/6, 435/71.1, 435/91.1,
436/173, 536/25.3, 536/25.4, 702/20



ABSTRACT : 



A technique is described for identifying mutations, if any, present in a biological 
sample, from a pre-selected set of known mutations. The method can be applied to
DNA, RNA and peptide nucleic acid (PNA) microarrays. The method analyzes a dot
spectrogram representative of quantized hybridization activity of oligonucleotides 
in the sample to identify the mutations. In accordance with the method, a resonance 
pattern is generated which is representative of nonlinear resonances between a 
stimulus pattern associated with the set of known mutations and the dot 
spectrogram. The resonance pattern is interpreted to yield a set of confirmed
mutations by comparing resonances found therein with predetermined resonances 
expected for the selected set of mutations. In a particular example, the resonance 
pattern is generated by iteratively processing the dot spectrogram by performing a 
convergent reverberation to yield a resonance pattern representative of resonances 
between a predetermined set of selected Quantum Expressor Functions and the dot 
spectrogram until a predetermined degree of convergence is achieved between the 
resonances found in the resonance pattern and resonances expected for the set of 
mutations. The resonance pattern is analyzed to yield a set of confirmed
mutations by mapping the confirmed mutations to known diseases associated with the 
pre-selected set of known mutations to identify diseases, if any, indicated by the 
biological sample. By exploiting a resonant interaction, mutation signatures may be 
robustly identified even in circumstances involving low signal to noise ratios or, 
in some cases, negative signal to noise ratios. 

31 Claims, 4 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 6 

L3: Entry 44 of 53 File: USPT Nov 7, 2000 



DOCUMENT-IDENTIFIER: US 6142681 A 

TITLE: Method and apparatus for interpreting hybridized bioelectronic DNA 
microarray patterns using self-scaling convergent reverberant dynamics 

Brief Summary Text (12) : 

Accordingly, it would be highly desirable to provide an improved method and apparatus
for analyzing the output of the DNA microarray to more expediently, reliably, and
inexpensively determine the presence of any conditions within the patient providing
the DNA sample. It is particularly desirable to provide a technique that can
identify mutation signatures within dot spectrograms even in circumstances wherein
the signal to noise ratio is extremely low. It is to these ends that aspects of
the invention are generally drawn.



□ 45. Document ID: US 6136541 A 

L3: Entry 45 of 53 File: USPT Oct 24, 2000 

US-PAT-NO: 6136541

DOCUMENT-IDENTIFIER: US 6136541 A 

TITLE: Method and apparatus for analyzing hybridized biochip patterns using 
resonance interactions employing quantum expressor functions 

DATE-ISSUED: October 24, 2000 
INVENTOR-INFORMATION: 

NAME CITY STATE ZIP CODE COUNTRY 

Gulati; Sandeep La Canada CA 

US-CL-CURRENT: 435/6; 382/129, 536/24.3, 536/24.31, 536/24.32, 702/19, 702/20
ABSTRACT: 

A technique is described for identifying mutations, if any, present in a biological 
sample, from a pre-selected set of known mutations. The method can be applied to 
DNA, RNA and peptide nucleic acid (PNA) microarrays. The method analyzes a dot
spectrogram representative of quantized hybridization activity of oligonucleotides 
in the sample to identify the mutations. In accordance with the method, a resonance 
pattern is generated which is representative of nonlinear resonances between a 
stimulus pattern associated with the set of known mutations and the dot 
spectrogram. The resonance pattern is interpreted to yield a set of confirmed
mutations by comparing resonances found therein with predetermined resonances 
expected for the selected set of mutations. In a particular example, the resonance 
pattern is generated by iteratively processing the dot spectrogram by performing a 
convergent reverberation to yield a resonance pattern representative of resonances 
between a predetermined set of selected Quantum Expressor Functions and the dot 
spectrogram until a predetermined degree of convergence is achieved between the 
resonances found in the resonance pattern and resonances expected for the set of 
mutations. The resonance pattern is analyzed to yield a set of confirmed
mutations by mapping the confirmed mutations to known diseases associated with the 
pre-selected set of known mutations to identify diseases, if any, indicated by the 
biological sample. By exploiting a resonant interaction, mutation signatures may be 
robustly identified even in circumstances involving low signal to noise ratios or, 
in some cases, negative signal to noise ratios. 

31 Claims, 7 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 6 

L3: Entry 45 of 53 File: USPT Oct 24, 2000

DOCUMENT-IDENTIFIER: US 6136541 A

TITLE: Method and apparatus for analyzing hybridized biochip patterns using 
resonance interactions employing quantum expressor functions 



Brief Summary Text (10) : 

Accordingly, it would be highly desirable to provide an improved method and apparatus
for analyzing the output of the DNA microarray to more expediently, reliably, and
inexpensively determine the presence of any medical conditions or concerns within
the patient providing the DNA sample. It is particularly desirable to provide a
technique that can identify mutation signatures within dot spectrograms even in
circumstances wherein the signal to noise ratio is extremely low. It is to these
ends that aspects of the invention are generally drawn.
INT-CL (IPC): C12 Q 1/68
ABSTRACT: 

The present invention relates to correction method, apparatus and recording medium 
on an oligonucleotide microarray using Principal Component Analysis (PCA). More 
particularly, the present invention relates to correction method, apparatus and 
recording medium on the oligonucleotide microarray using the correlation of probe 
set for detecting and correcting the faulty probe expression data in the outliers 
of the oligonucleotide microarray by applying PCA to each probe set of gene. Since 
the faulty probe data is corrected close to the normal value, the present invention 
makes it possible to remove the noise included in the oligonucleotide microarray, 
improve the accuracy and efficiency of chip experiment and analysis due to 
obtainment of accurate expression intensity data, and standardize the 
oligonucleotide chip data. 

L3: Entry 46 of 53 File: EPAB Dec 24, 2003 



DOCUMENT-IDENTIFIER: WO 3106710 A1 

TITLE: CORRECTION METHOD, APPARATUS AND RECORDING MEDIUM ON OLIGONUCLEOTIDE 
MICROARRAY USING PRINCIPAL COMPONENT ANALYSIS 

Abstract Text (1) : 

The present invention relates to correction method, apparatus and recording medium 
on an oligonucleotide microarray using Principal Component Analysis (PCA). More 
particularly, the present invention relates to correction method, apparatus and 
recording medium on the oligonucleotide microarray using the correlation of probe 
set for detecting and correcting the faulty probe expression data in the outliers 
of the oligonucleotide microarray by applying PCA to each probe set of gene. Since 
the faulty probe data is corrected close to the normal value, the present invention 
makes it possible to remove the noise included in the oligonucleotide microarray, 
improve the accuracy and efficiency of chip experiment and analysis due to 
obtainment of accurate expression intensity data, and standardize the 
oligonucleotide chip data. 
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DOCUMENT-IDENTIFIER: WO 3035911 A1 

TITLE: CLONE IDENTIFICATION BY DIRECT DETECTION OF NUCLEIC ACID BINDING PROTEINS 
FROM VERTEBRATE CELL EXPRESSION SYSTEMS AND APPLICATIONS THEREOF 
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INT-CL (IPC) : C12 Q 1/68; C12 P 19/34; C07 H 21/02; C07 H 21/04; C07 H 19/00 
EUR-CL (EPC) : C12N015/10 

ABSTRACT: 

A general vertebrate cloning and screening process for 
identification of genes encoding nucleic acid binding proteins is disclosed, 
including steps of library division and protein expression, followed by a nucleic 
acid binding assay as a clone screening step that is analyzed by either an 
Electrophoretic Mobility Shift Assay (EMSA) or a chromatography assay, preferably 
in a high throughput assay format. The disclosed technology provides a complete 
clone selection system for genes of DNA and RNA binding proteins expressed in 
vertebrate cells. Included are ways to optimize vertebrate cell transformation or 
transfection conditions by measuring the nucleic acid binding function of an 
expressed control protein. Also included is a high throughput electrophoretic 
mobility shift assay (EMSA) for detection of nucleic acid binding proteins, using a 
novel application of thin layer agarose gels for EMSA, combining a high vacuum and 
high temperature blotting technique for agarose gel desiccation with the use of a 
high efficiency transfer matrix (for instance, quaternary aminated nylon membranes) 
for blotting nucleic acid-protein complexes from agarose gels. Compensation for 
sample-to-sample variations in signal-to-noise ratio in clone screening is provided 
by introducing a reference probe to detect binding of a known protein expressed in 
the vertebrate host cell, along with the tester probe with the target sequence for 
which new clones encoding binding proteins are sought. High throughput 
chromatography screening methodology for nucleic acid binding proteins is also 
disclosed, allowing automation of screening. The disclosed technology includes 
combinatorial applications providing assays for functional gene linkage groups and 
can also be applied in connection with cDNA microarray and protein chip 
technologies. 

L3: Entry 47 of 53 File: EPAB May 1, 2003 



DOCUMENT-IDENTIFIER: WO 3035911 A1 

TITLE: CLONE IDENTIFICATION BY DIRECT DETECTION OF NUCLEIC ACID BINDING PROTEINS 
FROM VERTEBRATE CELL EXPRESSION SYSTEMS AND APPLICATIONS THEREOF 



Abstract Text (1) : 

A general vertebrate cloning and screening process for 
identification of genes encoding nucleic acid binding proteins is disclosed, 
including steps of library division and protein expression, followed by a nucleic 
acid binding assay as a clone screening step that is analyzed by either an 
Electrophoretic Mobility Shift Assay (EMSA) or a chromatography assay, preferably 
in a high throughput assay format. The disclosed technology provides a complete 
clone selection system for genes of DNA and RNA binding proteins expressed in 
vertebrate cells. Included are ways to optimize vertebrate cell transformation or 
transfection conditions by measuring the nucleic acid binding function of an 
expressed control protein. Also included is a high throughput electrophoretic 
mobility shift assay (EMSA) for detection of nucleic acid binding proteins, using a 
novel application of thin layer agarose gels for EMSA, combining a high vacuum and 
high temperature blotting technique for agarose gel desiccation with the use of a 
high efficiency transfer matrix (for instance, quaternary aminated nylon membranes) 
for blotting nucleic acid-protein complexes from agarose gels. Compensation for 
sample-to-sample variations in signal-to-noise ratio in clone screening is provided 
by introducing a reference probe to detect binding of a known protein expressed in 
the vertebrate host cell, along with the tester probe with the target sequence for 
which new clones encoding binding proteins are sought. High throughput 
chromatography screening methodology for nucleic acid binding proteins is also 
disclosed, allowing automation of screening. The disclosed technology includes 
combinatorial applications providing assays for functional gene linkage groups and 
can also be applied in connection with cDNA microarray and protein chip 
technologies. 
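
The reference-probe compensation mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the two probe signals are combined, so the simple ratio used here is an assumption, and the function name, lane names, and band intensities are hypothetical.

```python
def compensate(tester_signal, reference_signal):
    """Express the tester-probe band relative to the reference-probe band.

    Assumption: a plain ratio is used; the source does not specify the formula.
    """
    if reference_signal <= 0:
        raise ValueError("reference probe gave no signal; lane cannot be normalized")
    return tester_signal / reference_signal


# Two hypothetical lanes with the same relative binding but different
# overall signal levels, given as (tester, reference) band intensities.
lanes = {"clone_pool_A": (120.0, 400.0), "clone_pool_B": (30.0, 100.0)}
ratios = {name: compensate(t, r) for name, (t, r) in lanes.items()}
```

Under this assumption the two lanes give the same compensated value even though their raw tester signals differ by a factor of four, which is the kind of sample-to-sample variation the reference probe is meant to absorb.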
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File: EPAB Apr 17, 2003 



PUB-NO: WO003030620A2 
DOCUMENT-IDENTIFIER: WO 3030620 A2 
TITLE: IMAGING MICROARRAYS 

PUBN-DATE: April 17, 2003 



INVENTOR-INFORMATION: 

NAME COUNTRY 

PIPER, JAMES R GB 



ABSTRACT: 

A method of obtaining a corrected image of a microarray includes acquiring an image 
of a microarray including a target spot, and processing the image to correct for 
background noise and chip misalignment. The method also includes analyzing the 
image to identify a target patch, edit debris, and correct for ratio bias; and 
detecting single copy number variation in the target spot using an objective 
statistical analysis that includes a t-value statistical analysis. The method 
provides statistically robust computational processes for accurately detecting 
genomic variation at the single copy level. 



L3: Entry 48 of 53 File: EPAB Apr 17, 2003 



DOCUMENT-IDENTIFIER: WO 3030620 A2 
TITLE: IMAGING MICROARRAYS 



Abstract Text (1) : 



A method of obtaining a corrected image of a microarray includes acquiring an image 
of a microarray including a target spot, and processing the image to correct for 
background noise and chip misalignment. The method also includes analyzing the 
image to identify a target patch, edit debris, and correct for ratio bias; and 
detecting single copy number variation in the target spot using an objective 
statistical analysis that includes a t-value statistical analysis. The method 
provides statistically robust computational processes for accurately detecting 
genomic variation at the single copy level. 
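
As an illustration of the t-value step, the sketch below compares replicate log2 ratios for a target spot against replicate log2 ratios for reference spots. The abstract only states that a t-value analysis is used, so the Welch (unequal-variance) form, the function name, and the sample values are all assumptions made for this example.

```python
import math
from statistics import mean, variance


def welch_t(target, reference):
    """Welch's t statistic for two samples of log2 intensity ratios."""
    standard_error = math.sqrt(
        variance(target) / len(target) + variance(reference) / len(reference)
    )
    return (mean(target) - mean(reference)) / standard_error


# Hypothetical replicates. A single-copy gain on a two-copy background
# gives an ideal log2 ratio of log2(3/2), roughly 0.585.
target_spot = [0.55, 0.60, 0.62, 0.57]
reference_spots = [0.02, -0.03, 0.01, 0.00]
t_value = welch_t(target_spot, reference_spots)
```

A large absolute t value suggests the target spot differs from the reference by more than replicate noise would explain; the threshold for calling a single-copy change is a design decision that the abstract does not give.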
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□ 49. Document ID: WO 2091211 A1 

L3: Entry 49 of 53 File: EPAB Nov 14, 2002 

PUB-NO: WO002091211A1 
DOCUMENT-IDENTIFIER: WO 2091211 A1 

TITLE: KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES 

PUBN-DATE: November 14, 2002 

INVENTOR-INFORMATION: 
NAME 

BARTLETT, PETER 
ELLISSEEFF, ANDRE 
SCHOELKOPF, BERNARD 

INT-CL (IPC): G06 F 15/18 



ABSTRACT: 

Kernels (206) for use in learning machines, such as 
support vector machines, and methods are provided for selection and construction of 
such kernels are controlled by the nature of the data to be analyzed (203). In 
particular, data which may possess characteristics such as structure, for example 
DNA sequences, documents; graphs, signals, such as ECG signals and microarray 
expression profiles; spectra; images; spatio-temporal data; and relational data, 
and which may possess invariances or noise components that can interfere with the 
ability to accurately extract the desired information. Where structured datasets 
are analyzed, locational kernels are defined to provide measures of similarity 
among data points (210). The locational kernels are then combined to generate the 
decision function, or kernel. Where invariance transformations or noise is present, 
tangent vectors are defined to identify relationships between the invariance or 
noise and the data points (222). A covariance matrix is formed using the tangent 
vectors, then used in generation of the kernel. 
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Abstract Text (1) : 

Kernels (206) for use in learning machines, such as 
support vector machines, and methods are provided for selection and construction of 
such kernels are controlled by the nature of the data to be analyzed (203). In 
particular, data which may possess characteristics such as structure, for example 
DNA sequences, documents; graphs, signals, such as ECG signals and microarray 
expression profiles; spectra; images; spatio-temporal data; and relational data, 
and which may possess invariances or noise components that can interfere with the 
ability to accurately extract the desired information. Where structured datasets 
are analyzed, locational kernels are defined to provide measures of similarity 
among data points (210). The locational kernels are then combined to generate the 
decision function, or kernel. Where invariance transformations or noise is present, 
tangent vectors are defined to identify relationships between the invariance or 
noise and the data points (222). A covariance matrix is formed using the tangent 
vectors, then used in generation of the kernel. 
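
One way a tangent-vector covariance matrix can enter a kernel is sketched below. The abstract does not give the formula, so the form k(x, y) = x^T (gamma*C + (1 - gamma)*I)^(-1) y, the parameter gamma, and all function names are assumptions chosen for illustration.

```python
def tangent_covariance(tangents):
    """Average outer product of the tangent vectors (a d x d matrix)."""
    d = len(tangents[0])
    n = len(tangents)
    return [[sum(t[i] * t[j] for t in tangents) / n for j in range(d)]
            for i in range(d)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def invariant_kernel(x, y, cov, gamma=0.5):
    """Assumed form: x^T (gamma*cov + (1 - gamma)*I)^(-1) y, with 0 <= gamma < 1."""
    d = len(x)
    reg = [[gamma * cov[i][j] + (1.0 - gamma) * (1.0 if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    z = solve(reg, y)
    return sum(xi * zi for xi, zi in zip(x, z))


# One hypothetical tangent vector: the invariance or noise acts along axis 0.
cov = tangent_covariance([[1.0, 0.0]])
along = invariant_kernel([1.0, 0.0], [1.0, 0.0], cov)
across = invariant_kernel([0.0, 1.0], [0.0, 1.0], cov)
```

With this choice, variation along the tangent direction contributes less to the kernel than variation across it, and setting gamma to 0 recovers the ordinary dot product.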
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ABSTRACTED-PUB-NO: WO2003106710A 
BASIC-ABSTRACT: 

NOVELTY - Correcting outliers on an oligonucleotide microarray using principal 
component analysis (PCA), involves constructing a correlation structure model 
indicating a correlation structure between probes of the oligonucleotide microarray 
data by use of the PCA, and correcting a faulty probe data by projecting the 
correlation structure model to the outlier. 



DETAILED DESCRIPTION - Correcting (M1) outliers on an oligonucleotide microarray 
using principal component analysis (PCA), involves constructing a correlation 
structure model indicating a correlation structure between probes of the 
oligonucleotide microarray data by use of the PCA or detecting an outlier in the 
oligonucleotide microarray data, constructing a first correlation structure model 
for model data to be used to correct the outlier by use of PCA, and correcting a 
faulty probe data by projecting the correlation structure model to the outlier. 
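
The correction described above can be sketched as follows. This is a minimal illustration rather than the claimed method: it assumes a one-component PCA model fitted by power iteration to arrays known to be good, and it re-estimates a single flagged probe from the remaining probes of the same probe set. The function names and the toy intensities are hypothetical.

```python
import math


def first_component(rows, iters=200):
    """Return (mean vector, unit first principal axis) of the rows."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - mean[j] for j in range(p)] for r in rows]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        # w = X^T (X v), which is proportional to the covariance matrix times v
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def residual(x, mean, axis):
    """Distance of array x from the one-component model (large means outlier)."""
    d = [xi - m for xi, m in zip(x, mean)]
    score = sum(a * b for a, b in zip(axis, d))
    return math.sqrt(sum((di - score * a) ** 2 for di, a in zip(d, axis)))


def correct_probe(x, faulty, mean, axis):
    """Re-estimate probe `faulty` of array x from the trusted probes."""
    keep = [j for j in range(len(x)) if j != faulty]
    # Least-squares score on the axis, computed from the trusted probes only.
    score = (sum(axis[j] * (x[j] - mean[j]) for j in keep)
             / sum(axis[j] ** 2 for j in keep))
    fixed = list(x)
    fixed[faulty] = mean[faulty] + score * axis[faulty]
    return fixed


# Hypothetical model data: rows are arrays, columns are probes of one probe set.
model = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
mean, axis = first_component(model)
suspect = [5.0, 10.0, 40.0]  # third probe is far off the correlation structure
fixed = correct_probe(suspect, faulty=2, mean=mean, axis=axis)
```

In this toy data the probes are perfectly correlated, so the flagged probe is pulled back to the value implied by the other two. Real probe sets would need more components and a rule for deciding which probe is faulty, neither of which is specified in the abstract.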

INDEPENDENT CLAIMS are also included for the following: 

(1) a computer-readable medium including a program containing computer-executable 
instructions to perform (M1); 

(2) a correction apparatus (I) of an oligonucleotide microarray, comprising a 
correlation structure model generator for constructing a correlation structure 
model indicating a correlation structure between probes of the oligonucleotide 
microarray data by use of the PCA or an outlier extractor for detecting outlier 
from the oligonucleotide microarray data, a first correlation structure model 
generator for constructing a first correlation structure model for model data to be 
used to correct the outlier by use of the PCA, and a data corrector for correcting 
a faulty probe data by projecting the first correlation structure model to the 
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outlier; and 

(3) a program being executed in a digital processing device controls operations of 
the digital processing device to perform (M1). 

USE - (M1) is useful for detecting and correcting the faulty probe expression data 
in the outliers of the oligonucleotide microarray by applying PCA to each probe set 
of gene (claimed). (M1) is useful for preserving the biological characteristics of 
the raw data. 

ADVANTAGE - (M1) efficiently detects and corrects the faulty probe expression data 
in the outliers of the oligonucleotide microarray by applying PCA to each probe set 
of gene and effectively removes the noise included in the oligonucleotide 
microarray. (M1) improves the accuracy and efficiency of chip experiment and 
analysis due to obtainment of accurate expression intensity data, and standardizes 
the oligonucleotide chip data. 

DESCRIPTION OF DRAWING(S) - The figure shows a flowchart of correcting outliers on 
the nucleotide microarray using principal component analysis (PCA). 
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TITLE: Correcting outlier on oligonucleotide microarray by principal component 
analysis, by making correlation structure model denoting correlation structure of 
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Basic Abstract Text (8) : 

ADVANTAGE - (M1) efficiently detects and corrects the faulty probe expression data 
in the outliers of the oligonucleotide microarray by applying PCA to each probe set 
of gene and effectively removes the noise included in the oligonucleotide 
microarray. (M1) improves the accuracy and efficiency of chip experiment and 
analysis due to obtainment of accurate expression intensity data, and standardizes 
the oligonucleotide chip data. 
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